CELL CYCLE REGULATORY PROTEINS CDC-16, DP-1, DP-2 AND E2F FROM PLANTS

This application claims the benefit of U.S. Provisional Application 
No. 60/081,132, filed April 9, 1998.

FIELD OF THE INVENTION

This invention is in the field of plant molecular biology. More specifically, this
invention pertains to nucleic acid fragments encoding cell cycle regulatory proteins in plants
and seeds. 

BACKGROUND OF THE INVENTION
Cells divide by duplicating their chromosomes and segregating one copy of each
duplicated chromosome, as well as providing essential organelles, to each of two daughter
cells. Regulation of cell division is critical for the normal development of multicellular
organisms. A cell that is destined to grow and divide must pass through specific phases of a
cell cycle: G1, S (period of DNA synthesis), G2, and M (mitosis). Studies have shown that
cell division is controlled via the regulation of two critical events during the cell cycle:
initiation of DNA synthesis and the initiation of mitosis. Several kinase proteins control cell
cycle progression through these events. These protein kinases are heterodimeric proteins
having a cyclin-dependent kinase (Cdk) subunit and a cyclin subunit that provides the
regulatory specificity to the heterodimeric protein. These heterodimeric proteins regulate
the cell cycle by interacting with proteins involved in the initiation of DNA synthesis and
mitosis and phosphorylating them at specific regulatory sites, activating some and
inactivating others. The cyclin subunit concentration varies in phase with the cell cycle, while
the concentration of the Cdks remains relatively constant throughout the cell cycle.

In mammalian cells, transcription factor E2F genes encode a family of proteins that
bind to the nucleotide sequence TTTCGCGC and regulate the expression of various cellular
and viral promoters. Another transcription factor family of proteins, DP-1 and DP-2, can
form heterodimers with E2F proteins in vivo (Helin, K. et al. (1994) Genes Dev.
10:1850-1861; Ivey-Hoyle, M. et al. (1993) Mol. Cell. Biol. 13(12):7802-7812; and Zhang,
Y. et al. (1995) Oncogene 10(11):2085-2093). The E2F-DP transcription factors are major
regulators of genes that are required for the progression of S-phase, such as DHFR and DNA
polymerase alpha, and they play a critical role in cell cycle regulation and differentiation.
The retinoblastoma tumor suppressor protein has been shown to induce growth arrest by
binding to E2F-DP and repressing its activity. Lastly, CDC-16 is a cell cycle protein
identified in mammalian and yeast cells. This protein has been shown to localize to the
centrosome and mitotic spindle and is essential for the metaphase to anaphase transition
during cell division (Lamb, J. R. et al. (1994) EMBO J. 13(18):4321-4328).

There is a great deal of interest in identifying the genes that encode cell cycle
regulatory proteins in plants. These genes may be used to express cell cycle regulatory
[...] nucleic acid sequences encoding all or a portion of a cell cycle regulatory protein would
facilitate studies to better understand cell cycle regulation in plants, provide genetic tools to
enhance cell growth in tissue culture, increase the efficiency of gene transfer and help
provide more stable transformations. Cell cycle regulatory proteins may also provide targets
to facilitate design and/or identification of inhibitors of cell cycle regulatory proteins that
may be useful as herbicides.

SUMMARY OF THE INVENTION
The instant invention relates to isolated nucleic acid fragments encoding cell cycle 
regulatory proteins. Specifically, this invention concerns an isolated nucleic acid fragment 
encoding a CDC-16, DP-1, DP-2 or E2F protein. In addition, this invention relates to a
nucleic acid fragment that is complementary to the nucleic acid fragment encoding a 
CDC-16, DP-1, DP-2 or E2F protein. 

An additional embodiment of the instant invention pertains to a polypeptide encoding 
all or a substantial portion of a cell cycle regulatory protein selected from the group 
consisting of CDC-16, DP-1, DP-2 and E2F.

In another embodiment, the instant invention relates to a chimeric gene encoding a 
CDC-16, DP-1, DP-2 or E2F protein, or to a chimeric gene that comprises a nucleic acid 
fragment that is complementary to a nucleic acid fragment encoding a CDC-16, DP-1, DP-2
or E2F protein, operably linked to suitable regulatory sequences, wherein expression of the
chimeric gene results in production of levels of the encoded protein in a transformed host
cell that are altered (i.e., increased or decreased) from the level produced in an untransformed
host cell.

In a further embodiment, the instant invention concerns a transformed host cell 
comprising in its genome a chimeric gene encoding a CDC-16, DP-1, DP-2 or E2F protein,
operably linked to suitable regulatory sequences. Expression of the chimeric gene results in
production of altered levels of the encoded protein in the transformed host cell. The
transformed host cell can be of eukaryotic or prokaryotic origin, and includes cells derived
from higher plants and microorganisms. The invention also includes transformed plants that
arise from transformed host cells of higher plants, and seeds derived from such transformed
plants.

An additional embodiment of the instant invention concerns a method of altering the 
level of expression of a CDC-16, DP-1, DP-2 or E2F protein in a transformed host cell 
comprising: a) transforming a host cell with a chimeric gene comprising a nucleic acid 
fragment encoding a CDC-16, DP-1, DP-2 or E2F protein; and b) growing the transformed 
host cell under conditions that are suitable for expression of the chimeric gene wherein
expression of the chimeric gene results in production of altered levels of CDC-16, DP-1,
DP-2 or E2F protein in the transformed host cell.

An additional embodiment of the instant invention concerns a method for obtaining a
nucleic acid fragment encoding all or a substantial portion of an amino acid sequence
encoding a CDC-16, DP-1, DP-2 or E2F protein.

A further embodiment of the instant invention is a method for evaluating at least one 
compound for its ability to inhibit the activity of a CDC-16, DP-1, DP-2 or E2F protein, the
method comprising the steps of: (a) transforming a host cell with a chimeric gene
comprising a nucleic acid fragment encoding a CDC-16, DP-1, DP-2 or E2F protein,
operably linked to suitable regulatory sequences; (b) growing the transformed host cell under
conditions that are suitable for expression of the chimeric gene wherein expression of the
chimeric gene results in production of CDC-16, DP-1, DP-2 or E2F protein in the
transformed host cell; (c) optionally purifying the CDC-16, DP-1, DP-2 or E2F protein
expressed by the transformed host cell; (d) treating the CDC-16, DP-1, DP-2 or E2F protein
with a compound to be tested; and (e) comparing the activity of the CDC-16, DP-1, DP-2 or
E2F protein that has been treated with a test compound to the activity of an untreated
CDC-16, DP-1, DP-2 or E2F protein, thereby selecting compounds with potential for
inhibitory activity.

BRIEF DESCRIPTION OF THE SEQUENCE LISTINGS 
The invention can be more fully understood from the following detailed description 
and the accompanying Sequence Listing which form a part of this application. 

The following sequence descriptions and Sequence Listing attached hereto comply
with the rules governing nucleotide and/or amino acid sequence disclosures in patent
applications as set forth in 37 C.F.R. §1.821-1.825.

SEQ ID NO:1 is the nucleotide sequence comprising a portion of the cDNA insert in
clone cpf1c.pk001.k13 encoding a portion of a corn CDC-16 protein.

SEQ ID NO:2 is the deduced amino acid sequence of a portion of a CDC-16 protein
derived from the nucleotide sequence of SEQ ID NO:1.

SEQ ID NO:3 is the nucleotide sequence comprising a portion of the cDNA insert in
clone sfl1.pk0030.e3 encoding a portion of a soybean CDC-16 protein.

SEQ ID NO:4 is the deduced amino acid sequence of a portion of a CDC-16 protein
derived from the nucleotide sequence of SEQ ID NO:3.

SEQ ID NO:5 is the nucleotide sequence comprising a portion of the cDNA insert in
clone ids.pk0025.f7 encoding a portion of an Impatiens balsamina DP-1 protein.

SEQ ID NO:6 is the deduced amino acid sequence of a portion of a DP-1 protein
derived from the nucleotide sequence of SEQ ID NO:5.

SEQ ID NO:7 is the nucleotide sequence comprising a portion of the cDNA insert in
clone p0072.comfs64r encoding a portion of a corn DP-1 protein.

SEQ ID NO:8 is the deduced amino acid sequence of a portion of a DP-1 protein
derived from the nucleotide sequence of SEQ ID NO:7.

SEQ ID NO:9 is the nucleotide sequence comprising a portion of the cDNA insert in
clone src3c.pk018.m11 encoding a portion of a soybean DP-1 protein.

SEQ ID NO:10 is the deduced amino acid sequence of a portion of a DP-1 protein
derived from the nucleotide sequence of SEQ ID NO:9.

SEQ ID NO:11 is the nucleotide sequence comprising a contig assembled from the
cDNA inserts in clones p0005.cbmfh22r, cde1c.pk001.j13 and cen3n.pk0183.b9 encoding a
portion of a corn DP-2 protein.

SEQ ID NO:12 is the deduced amino acid sequence of a portion of a DP-2 protein
derived from the nucleotide sequence of SEQ ID NO:11.

SEQ ID NO:13 is the nucleotide sequence comprising the entire cDNA insert in clone
wlmk1.pk0005.e2 encoding a portion of a wheat DP-2 protein.

SEQ ID NO:14 is the deduced amino acid sequence of a portion of a DP-2 protein
derived from the nucleotide sequence of SEQ ID NO:13.

SEQ ID NO:15 is the nucleotide sequence comprising a portion of the cDNA insert in
clone rsl1n.pk004.d15 encoding a portion of a rice E2F protein.

SEQ ID NO:16 is the deduced amino acid sequence of a portion of an E2F protein
derived from the nucleotide sequence of SEQ ID NO:15.

SEQ ID NO:17 is the nucleotide sequence comprising the entire cDNA insert in clone
se1.pk0012.f4 encoding a portion of a soybean E2F protein.

SEQ ID NO:18 is the deduced amino acid sequence of a portion of an E2F protein
derived from the nucleotide sequence of SEQ ID NO:17.

The Sequence Listing contains the one letter code for nucleotide sequence characters
and the three letter codes for amino acids as defined in conformity with the IUPAC-IUBMB
standards described in Nucleic Acids Research 13:3021-3030 (1985) and in the Biochemical
Journal 219 (No. 2):345-373 (1984) which are herein incorporated by reference. The
symbols and format used for nucleotide and amino acid sequence data comply with the rules
set forth in 37 C.F.R. §1.822.

DETAILED DESCRIPTION OF THE INVENTION
In the context of this disclosure, a number of terms shall be utilized. As used herein,
an "isolated nucleic acid fragment" is a polymer of RNA or DNA that is single- or double-
stranded, optionally containing synthetic, non-natural or altered nucleotide bases. An
isolated nucleic acid fragment in the form of a polymer of DNA may be comprised of one or
more segments of cDNA, genomic DNA or synthetic DNA. As used herein, "contig" refers
to an assemblage of overlapping nucleic acid sequences to form one contiguous nucleotide
sequence. For example, several DNA sequences can be compared and aligned to identify
common or overlapping regions. The individual sequences can then be assembled into a
single contiguous nucleotide sequence.
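
By way of illustration only, the assembly of overlapping sequences into a contig can be sketched as follows. This is a minimal sketch, assuming exact suffix-to-prefix overlaps and a hypothetical minimum overlap of three nucleotides; the function names and the example sequences are assumptions made for illustration and are not taken from the Sequence Listing or from any particular assembly program.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Return the length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def assemble_contig(reads: list[str], min_overlap: int = 3) -> str:
    """Greedily merge the pair of sequences with the largest overlap until one remains."""
    fragments = list(reads)
    while len(fragments) > 1:
        best = (0, None, None)  # (overlap size, index of left, index of right)
        for i, left in enumerate(fragments):
            for j, right in enumerate(fragments):
                if i != j:
                    size = overlap_length(left, right, min_overlap)
                    if size > best[0]:
                        best = (size, i, j)
        size, i, j = best
        if size == 0:
            raise ValueError("no overlapping sequences left to merge")
        # Join the two sequences, writing the shared region only once.
        merged = fragments[i] + fragments[j][size:]
        fragments = [f for k, f in enumerate(fragments) if k not in (i, j)] + [merged]
    return fragments[0]


# Hypothetical overlapping sequences (illustrative only).
reads = ["ATGGCGTAC", "GTACCTTGA", "TTGAGGCAT"]
contig = assemble_contig(reads)
print(contig)
```

A greedy merge is used here only because it is short; alignment-based assemblers that tolerate mismatches would be expected to behave differently on real sequence data.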

As used herein, "substantially similar" refers to nucleic acid fragments wherein
changes in one or more nucleotide bases result in substitution of one or more amino acids,
but do not affect the functional properties of the protein encoded by the DNA sequence.
"Substantially similar" also refers to nucleic acid fragments wherein changes in one or
more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate
alteration of gene expression by antisense or co-suppression technology. "Substantially
similar" also refers to modifications of the nucleic acid fragments of the instant invention
such as deletion or insertion of one or more nucleotides that do not substantially affect the
functional properties of the resulting transcript vis-a-vis the ability to mediate alteration of
gene expression by antisense or co-suppression technology or alteration of the functional
properties of the resulting protein molecule. It is therefore understood that the invention
encompasses more than the specific exemplary sequences.

For example, it is well known in the art that antisense suppression and co-suppression
of gene expression may be accomplished using nucleic acid fragments representing less than
the entire coding region of a gene, and by nucleic acid fragments that do not share 100%
sequence identity with the gene to be suppressed. Moreover, alterations in a gene which
result in the production of a chemically equivalent amino acid at a given site, but do not
affect the functional properties of the encoded protein, are well known in the art. Thus, a
codon for the amino acid alanine, a hydrophobic amino acid, may be substituted by a codon
encoding another less hydrophobic residue, such as glycine, or a more hydrophobic residue,
such as valine, leucine, or isoleucine. Similarly, changes which result in substitution of one
negatively charged residue for another, such as aspartic acid for glutamic acid, or one
positively charged residue for another, such as lysine for arginine, can also be expected to
produce a functionally equivalent product. Nucleotide changes which result in alteration of
the N-terminal and C-terminal portions of the protein molecule would also not be expected
to alter the activity of the protein. Each of the proposed modifications is well within the
routine skill in the art, as is determination of retention of biological activity of the encoded
products.
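
The substitutions named above can be expressed, purely as an illustration, as a lookup of residue groups. This is a minimal sketch: the three groups contain only the residues mentioned in the preceding paragraph, the group names and function names are assumptions, and the example peptides are hypothetical. It does not predict whether a given change actually preserves biological activity, which must be determined experimentally.

```python
# Residue groups limited to the examples given in the text (one-letter codes).
GROUPS = {
    "hydrophobic": {"A", "G", "V", "L", "I"},  # alanine, glycine, valine, leucine, isoleucine
    "negative": {"D", "E"},                    # aspartic acid, glutamic acid
    "positive": {"K", "R"},                    # lysine, arginine
}


def residue_group(residue: str):
    """Return the name of the group containing `residue`, or None if it is not listed."""
    for name, members in GROUPS.items():
        if residue in members:
            return name
    return None


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues fall in the same listed group."""
    group = residue_group(original)
    return group is not None and group == residue_group(replacement)


def nonconservative_changes(reference: str, variant: str):
    """List (position, reference residue, variant residue) for changes outside the listed groups."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(reference, variant), start=1)
        if a != b and not is_conservative(a, b)
    ]


# Hypothetical peptides: A->V and D->E stay within a group, L->W does not.
changes = nonconservative_changes("MADKL", "MVEKW")
print(changes)
```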

Moreover, substantially similar nucleic acid fragments may also be characterized by 
their ability to hybridize, under stringent conditions (0.1X SSC, 0.1% SDS, 65°C), with the
nucleic acid fragments disclosed herein. 

Substantially similar nucleic acid fragments of the instant invention may also be
characterized by the percent similarity of the amino acid sequences that they encode to the
amino acid sequences disclosed herein, as determined by algorithms commonly employed by
those skilled in this art. Preferred are those nucleic acid fragments whose nucleotide
sequences encode amino acid sequences that are 80% similar to the amino acid sequences
reported herein. More preferred nucleic acid fragments encode amino acid sequences that
are 90% similar to the amino acid sequences reported herein. Most preferred are nucleic acid
fragments that encode amino acid sequences that are 95% similar to the amino acid
sequences reported herein. Sequence alignments and percent similarity calculations were
performed using the Megalign program of the LASARGENE bioinformatics computing suite
(DNASTAR Inc., Madison, WI). Multiple alignment of the sequences was performed using
the Clustal method of alignment (Higgins, D. G. and Sharp, P. M. (1989) CABIOS
5:151-153) with the default parameters (GAP PENALTY=10, GAP LENGTH
PENALTY=10) (hereafter Clustal algorithm). Default parameters for pairwise alignments
using the Clustal method were KTUPLE=1, GAP PENALTY=3, WINDOW=5 and
DIAGONALS SAVED=5.
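
The way a percentage computed from a pairwise alignment maps onto the preferred, more preferred and most preferred ranges can be sketched as follows. This is a simplified stand-in and not the Megalign or Clustal calculation: it counts identical residues over aligned, ungapped columns, it assumes the two sequences have already been aligned with "-" marking gaps, and it reads the stated percentages as minimum values. The function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def preference_tier(percent: float) -> str:
    """Map a percentage onto the ranges named in the text (read here as minimums)."""
    if percent >= 95:
        return "most preferred"
    if percent >= 90:
        return "more preferred"
    if percent >= 80:
        return "preferred"
    return "below the stated thresholds"


# Hypothetical aligned amino acid sequences differing at one of ten positions.
score = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
tier = preference_tier(score)
print(score, tier)
```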

A "substantial portion" of an amino acid or nucleotide sequence comprises enough of
the amino acid sequence of a polypeptide or the nucleotide sequence of a gene to afford
putative identification of that polypeptide or gene, either by manual evaluation of the
sequence by one skilled in the art, or by computer-automated sequence comparison and
identification using algorithms such as BLAST (Basic Local Alignment Search Tool;
Altschul, S. F., et al., (1993) J. Mol. Biol. 215:403-410; see also
www.ncbi.nlm.nih.gov/BLAST/). In general, a sequence of ten or more contiguous amino
acids or thirty or more nucleotides is necessary in order to putatively identify a polypeptide
or nucleic acid sequence as homologous to a known protein or gene. Moreover, with respect
to nucleotide sequences, gene specific oligonucleotide probes comprising 20-30 contiguous
nucleotides may be used in sequence-dependent methods of gene identification (e.g.,
Southern hybridization) and isolation (e.g., in situ hybridization of bacterial colonies or
bacteriophage plaques). In addition, short oligonucleotides of 12-15 bases may be used as
amplification primers in PCR in order to obtain a particular nucleic acid fragment
comprising the primers. Accordingly, a "substantial portion" of a nucleotide sequence
comprises enough of the sequence to afford specific identification and/or isolation of a
nucleic acid fragment comprising the sequence. The instant specification teaches partial or
complete amino acid and nucleotide sequences encoding one or more particular plant
proteins. The skilled artisan, having the benefit of the sequences as reported herein, may
now use all or a substantial portion of the disclosed sequences for purposes known to those
skilled in this art. Accordingly, the instant invention comprises the complete sequences as
reported in the accompanying Sequence Listing, as well as substantial portions of those
sequences as defined above.
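
The length guidelines in the preceding paragraph can be collected into a small helper, shown here only as a sketch. It assumes that the lower end of each stated range acts as a minimum length (12 bases for a primer, 20 for a probe, 30 nucleotides or 10 amino acids for putative identification); the function name, the labels and the placeholder sequences are hypothetical.

```python
def possible_uses(sequence: str, kind: str) -> list[str]:
    """List the uses named in the text for which `sequence` is long enough."""
    length = len(sequence)
    uses = []
    if kind == "protein":
        if length >= 10:  # ten or more contiguous amino acids
            uses.append("putative identification")
    elif kind == "nucleotide":
        if length >= 12:  # short oligonucleotides of 12-15 bases
            uses.append("PCR primer")
        if length >= 20:  # probes of 20-30 contiguous nucleotides
            uses.append("hybridization probe")
        if length >= 30:  # thirty or more nucleotides
            uses.append("putative identification")
    else:
        raise ValueError("kind must be 'protein' or 'nucleotide'")
    return uses


# Placeholder sequences of different lengths (content is irrelevant to the check).
print(possible_uses("A" * 12, "nucleotide"))
print(possible_uses("A" * 30, "nucleotide"))
print(possible_uses("M" * 9, "protein"))
```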

"Codon degeneracy" refers to divergence in the genetic code permitting variation of 
the nucleotide sequence without affecting the amino acid sequence of an encoded
polypeptide. Accordingly, the instant invention relates to any nucleic acid fragment that
encodes all or a substantial portion of the amino acid sequence encoding the CDC-16, DP-1,
DP-2 or E2F proteins as set forth in SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16 and 18. The
skilled artisan is well aware of the "codon-bias" exhibited by a specific host cell in usage of
nucleotide codons to specify a given amino acid. Therefore, when synthesizing a gene for
improved expression in a host cell, it is desirable to design the gene such that its frequency
of codon usage approaches the frequency of preferred codon usage of the host cell.

"Synthetic genes" can be assembled from oligonucleotide building blocks that are 
chemically synthesized using procedures known to those skilled in the art. These building 
blocks are ligated and annealed to form gene segments which are then enzymatically
assembled to construct the entire gene. "Chemically synthesized", as related to a sequence
of DNA, means that the component nucleotides were assembled in vitro. Manual chemical
synthesis of DNA may be accomplished using well established procedures, or automated
chemical synthesis can be performed using one of a number of commercially available
machines. Accordingly, the genes can be tailored for optimal gene expression based on
optimization of nucleotide sequence to reflect the codon bias of the host cell. The skilled 
artisan appreciates the likelihood of successful gene expression if codon usage is biased 
towards those codons favored by the host. Determination of preferred codons can be based 
on a survey of genes derived from the host cell where sequence information is available. 
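
A survey of host genes for preferred codons, followed by design of a synthetic coding sequence, can be sketched as follows. This is a minimal sketch under stated assumptions: the codon table is deliberately partial and covers only the four amino acids used in the toy example, the "host" coding sequences are invented, and the most frequent codon for each amino acid is taken as the preferred one. None of these names or sequences come from the instant disclosure.

```python
from collections import Counter, defaultdict

# Partial standard codon table (assumption: only the codons needed for this example).
CODON_TABLE = {
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "GAT": "D", "GAC": "D",
    "ATG": "M",
}


def preferred_codons(host_coding_sequences):
    """Count codon usage per amino acid and return the most frequent codon for each."""
    counts = defaultdict(Counter)
    for cds in host_coding_sequences:
        usable = len(cds) - len(cds) % 3  # ignore any incomplete trailing codon
        for start in range(0, usable, 3):
            codon = cds[start:start + 3]
            amino_acid = CODON_TABLE.get(codon)
            if amino_acid is not None:
                counts[amino_acid][codon] += 1
    return {aa: usage.most_common(1)[0][0] for aa, usage in counts.items()}


def back_translate(protein, preferred):
    """Build a coding sequence for `protein` using the preferred codon of each residue."""
    return "".join(preferred[aa] for aa in protein)


# Hypothetical host coding sequences (illustrative only).
host_genes = ["ATGGCCAAGGACGCC", "ATGGCCAAGGACGCG", "ATGGCTAAAGATGCC"]
preferred = preferred_codons(host_genes)
synthetic = back_translate("MAKD", preferred)
print(preferred)
print(synthetic)
```

Choosing the single most frequent codon is the simplest possible rule; matching the overall frequency distribution of the host, as the text describes, would call for a weighted choice instead.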

"Gene" refers to a nucleic acid fragment that expresses a specific protein, including
regulatory sequences preceding (5' non-coding sequences) and following (3' non-coding
sequences) the coding sequence. "Native gene" refers to a gene as found in nature with its
own regulatory sequences. "Chimeric gene" refers to any gene that is not a native gene,
comprising regulatory and coding sequences that are not found together in nature.
Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that
are derived from different sources, or regulatory sequences and coding sequences derived
from the same source, but arranged in a manner different than that found in nature.
"Endogenous gene" refers to a native gene in its natural location in the genome of an
organism. A "foreign" gene refers to a gene not normally found in the host organism, but
that is introduced into the host organism by gene transfer. Foreign genes can comprise
native genes inserted into a non-native organism, or chimeric genes. A "transgene" is a gene
that has been introduced into the genome by a transformation procedure.

"Coding sequence" refers to a DNA sequence that codes for a specific amino acid 
sequence. "Regulatory sequences" refer to nucleotide sequences located upstream (5' non-coding
sequences), within, or downstream (3' non-coding sequences) of a coding sequence,
and which influence the transcription, RNA processing or stability, or translation of the
associated coding sequence. Regulatory sequences may include promoters, translation
leader sequences, introns, and polyadenylation recognition sequences.

WO 99/53075 PCT/US99/07638

"Promoter" refers to a DNA sequence capable of controlling the expression of a
coding sequence or functional RNA. In general, a coding sequence is located 3' to a
promoter sequence. The promoter sequence consists of proximal and more distal upstream
elements, the latter elements often referred to as enhancers. Accordingly, an "enhancer" is a
DNA sequence which can stimulate promoter activity and may be an innate element of the
promoter or a heterologous element inserted to enhance the level or tissue-specificity of a
promoter. Promoters may be derived in their entirety from a native gene, or be composed of
different elements derived from different promoters found in nature, or even comprise
synthetic DNA segments. It is understood by those skilled in the art that different promoters
may direct the expression of a gene in different tissues or cell types, or at different stages of
development, or in response to different environmental conditions. Promoters which cause a
gene to be expressed in most cell types at most times are commonly referred to as
"constitutive promoters". New promoters of various types useful in plant cells are
constantly being discovered; numerous examples may be found in the compilation by
Okamuro and Goldberg, (1989) Biochemistry of Plants 15:1-82. It is further recognized that
since in most cases the exact boundaries of regulatory sequences have not been completely
defined, DNA fragments of different lengths may have identical promoter activity.

The "translation leader sequence" refers to a DNA sequence located between the 
promoter sequence of a gene and the coding sequence. The translation leader sequence is 
present in the fully processed mRNA upstream of the translation start sequence. The 
translation leader sequence may affect processing of the primary transcript to mRNA, 
mRNA stability or translation efficiency. Examples of translation leader sequences have 
been described (Turner, R. and Foster, G. D. (1995) Molecular Biotechnology 3:225).

The "3' non-coding sequences" refer to DNA sequences located downstream of a 
coding sequence and include polyadenylation recognition sequences and other sequences 
encoding regulatory signals capable of affecting mRNA processing or gene expression. The
polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid
tracts to the 3' end of the mRNA precursor. The use of different 3' non-coding sequences is
exemplified by Ingelbrecht et al., (1989) Plant Cell 1:671-680.

"RNA transcript" refers to the product resulting from RNA polymerase-catalyzed 
transcription of a DNA sequence. When the RNA transcript is a perfect complementary 
copy of the DNA sequence, it is referred to as the primary transcript or it may be a RNA 
sequence derived from posttranscriptional processing of the primary transcript and is 
referred to as the mature RNA. "Messenger RNA (mRNA)" refers to the RNA that is
without introns and that can be translated into protein by the cell. "cDNA" refers to a
double-stranded DNA that is complementary to and derived from mRNA. "Sense" RNA
refers to RNA transcript that includes the mRNA and so can be translated into protein by the
cell. "Antisense RNA" refers to a RNA transcript that is complementary to all or part of a
target primary transcript or mRNA and that blocks the expression of a target gene (U.S. Pat.
No. 5,107,065, incorporated herein by reference). The complementarity of an antisense
RNA may be with any part of the specific gene transcript, i.e., at the 5' non-coding sequence,
3' non-coding sequence, introns, or the coding sequence. "Functional RNA" refers to sense
RNA, antisense RNA, ribozyme RNA, or other RNA that may not be translated but yet has 
an effect on cellular processes. 
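
The relationship between a coding strand, its sense transcript and a complementary antisense RNA can be illustrated with a short sketch. It assumes a DNA coding strand written 5' to 3' using only the letters A, C, G and T; the function names and the nine-nucleotide example are hypothetical and serve only to show the base-pairing relationship described above.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
RNA_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.translate(DNA_COMPLEMENT)[::-1]


def transcribe(coding_strand: str) -> str:
    """Sense RNA has the same sequence as the coding strand, with U in place of T."""
    return coding_strand.replace("T", "U")


def antisense_rna(coding_strand: str) -> str:
    """Antisense RNA is the transcript of the opposite strand, written 5' to 3'."""
    return transcribe(reverse_complement(coding_strand))


def is_complementary(rna_a: str, rna_b: str) -> bool:
    """True when the two RNAs can pair base-for-base in antiparallel orientation."""
    return len(rna_a) == len(rna_b) and all(
        (a, b) in RNA_PAIRS for a, b in zip(rna_a, reversed(rna_b))
    )


# Hypothetical coding strand (illustrative only).
coding = "ATGGCCAAG"
sense = transcribe(coding)
antisense = antisense_rna(coding)
print(sense, antisense, is_complementary(sense, antisense))
```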






The term "operably linked" refers to the association of nucleic acid sequences on a 
single nucleic acid fragment so that the function of one is affected by the other. For 
example, a promoter is operably linked with a coding sequence when it is capable of 
affecting the expression of that coding sequence (i.e., that the coding sequence is under the 
transcriptional control of the promoter). Coding sequences can be operably linked to
regulatory sequences in sense or antisense orientation. 

The term "expression", as used herein, refers to the transcription and stable 
accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of 
the invention. Expression may also refer to translation of mRNA into a polypeptide. 
"Antisense inhibition" refers to the production of antisense RNA transcripts capable of
suppressing the expression of the target protein. "Overexpression" refers to the production
of a gene product in transgenic organisms that exceeds levels of production in normal or
non-transformed organisms. "Co-suppression" refers to the production of sense RNA
transcripts capable of suppressing the expression of identical or substantially similar foreign
or endogenous genes (U.S. Patent No. 5,231,020, incorporated herein by reference).

"Altered levels" refers to the production of gene product(s) in transgenic organisms in 
amounts or proportions that differ from that of normal or non-transformed organisms.

"Mature" protein refers to a post-translationally processed polypeptide; i.e., one from 
which any pre- or propeptides present in the primary translation product have been removed. 
"Precursor" protein refers to the primary product of translation of mRNA; i.e., with pre- and
propeptides still present. Pre- and propeptides may be but are not limited to intracellular 
localization signals. 

A "chloroplast transit peptide" is an amino acid sequence which is translated in 
conjunction with a protein and directs the protein to the chloroplast or other plastid types
present in the cell in which the protein is made. "Chloroplast transit sequence" refers to a
nucleotide sequence that encodes a chloroplast transit peptide. A "signal peptide" is an
amino acid sequence which is translated in conjunction with a protein and directs the protein
to the secretory system (Chrispeels, J. J., (1991) Ann. Rev. Plant Phys. Plant Mol. Biol.
42:21-53). If the protein is to be directed to a vacuole, a vacuolar targeting signal (supra)
can further be added, or if to the endoplasmic reticulum, an endoplasmic reticulum retention
signal (supra) may be added. If the protein is to be directed to the nucleus, any signal
peptide present should be removed and instead a nuclear localization signal included
(Raikhel (1992) Plant Phys. 100:1627-1632).

"Transformation" refers to the transfer of a nucleic acid fragment into the genome of a
host organism, resulting in genetically stable inheritance. Host organisms containing the
transformed nucleic acid fragments are referred to as "transgenic" organisms. Examples of
methods of plant transformation include Agrobacterium-mediated transformation (De Blaere
et al. (1987) Meth. Enzymol. 143:211) and particle-accelerated or "gene gun" transformation
technology (Klein et al. (1987) Nature (London) 327:70-73; U.S. Patent No. 4,945,050,
incorporated herein by reference).

Standard recombinant DNA and molecular cloning techniques used herein are well 
known in the art and are described more fully in Sambrook, J., Fritsch, E. F. and Maniatis, T. 
Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold
Spring Harbor, 1989 (hereinafter "Maniatis").

Nucleic acid fragments encoding at least a portion of several cell cycle regulatory 
proteins have been isolated and identified by comparison of random plant cDNA sequences 
to public databases containing nucleotide and protein sequences using the BLAST 
algorithms well known to those skilled in the art. Table 1 lists the proteins that are described
herein, and the designation of the cDNA clones that comprise the nucleic acid fragments
encoding these proteins. 





TABLE 1
Cell Cycle Regulatory Proteins

Enzyme    Clone              Plant
CDC-16    cpf1c.pk001.k13    Corn
          sfl1.pk0030.e3     Soybean
DP-1      ids.pk0025.f7      Impatiens
          p0072.comfs64r     Corn
          src3c.pk018.m11    Soybean
DP-2      p0005.cbmfh22r     Corn
          cde1c.pk001.j13    Corn
          cen3n.pk0183.b9    Corn
          wlmk1.pk0005.e2    Wheat
E2F       rsl1n.pk004.d15    Rice
          se1.pk0012.f4      Soybean



The nucleic acid fragments of the instant invention may be used to isolate cDNAs and 
genes encoding homologous proteins from the same or other plant species. Isolation of 
homologous genes using sequence-dependent protocols is well known in the art. Examples 
of sequence-dependent protocols include, but are not limited to, methods of nucleic acid
hybridization, and methods of DNA and RNA amplification as exemplified by various uses
of nucleic acid amplification technologies (e.g., polymerase chain reaction, ligase chain
reaction).

For example, genes encoding other CDC-16, DP-1, DP-2 or E2F proteins, either as 
cDNAs or genomic DNAs, could be isolated directly by using all or a portion of the instant
nucleic acid fragments as DNA hybridization probes to screen libraries from any desired
plant employing methodology well known to those skilled in the art. Specific
oligonucleotide probes based upon the instant nucleic acid sequences can be designed and
synthesized by methods known in the art (Maniatis). Moreover, the entire sequences can be
used directly to synthesize DNA probes by methods known to the skilled artisan such as
random primer DNA labeling, nick translation, or end-labeling techniques, or RNA probes
using available in vitro transcription systems. In addition, specific primers can be designed
and used to amplify a part or all of the instant sequences. The resulting amplification
products can be labeled directly during amplification reactions or labeled after amplification
reactions, and used as probes to isolate full length cDNA or genomic fragments under
conditions of appropriate stringency.
In addition, two short segments of the instant nucleic acid fragments may be used in
polymerase chain reaction protocols to amplify longer nucleic acid fragments encoding
homologous genes from DNA or RNA. The polymerase chain reaction may also be
performed on a library of cloned nucleic acid fragments wherein the sequence of one primer
is derived from the instant nucleic acid fragments, and the sequence of the other primer takes
advantage of the presence of the polyadenylic acid tracts at the 3' end of the mRNA
precursor encoding plant genes. Alternatively, the second primer sequence may be based
upon sequences derived from the cloning vector. For example, the skilled artisan can follow
the RACE protocol (Frohman et al., (1988) PNAS USA 85:8998) to generate cDNAs by
using PCR to amplify copies of the region between a single point in the transcript and the 3'
or 5' end. Primers oriented in the 3' and 5' directions can be designed from the instant
sequences. Using commercially available 3' RACE or 5' RACE systems (BRL), specific 3'
or 5' cDNA fragments can be isolated (Ohara et al., (1989) PNAS USA 86:5673; Loh et al.,
(1989) Science 243:217). Products generated by the 3' and 5' RACE procedures can be
combined to generate full-length cDNAs (Frohman, M. A. and Martin, G. R., (1989)
Techniques 1:165).
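The primer orientation described above can be illustrated with a minimal sketch. It assumes a single window of known cDNA sequence is used for both directions: the sense-strand window serves as the gene-specific primer for 3' RACE, and its reverse complement serves as the gene-specific primer for 5' RACE. The function names and the example sequence are hypothetical and are not taken from the instant sequences.

```python
# Illustrative sketch only: helper names and the example sequence are invented.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def race_primers(cdna: str, start: int, length: int = 20) -> dict:
    """Return both orientations of one window of a known cDNA fragment.

    The sense-strand window points toward the 3' end of the transcript
    (usable for 3' RACE); its reverse complement points toward the 5' end
    (usable for 5' RACE).
    """
    window = cdna[start:start + length]
    if len(window) < length:
        raise ValueError("window runs past the end of the fragment")
    return {
        "three_prime_race": window,
        "five_prime_race": reverse_complement(window),
    }


fragment = "ATGGCTAGCTTGACCGGTACGATCGATTGCAAGCTTGGA"  # invented example
primers = race_primers(fragment, start=5, length=20)
print(primers)
```

In practice primer choice would also weigh melting temperature, GC content and self-complementarity, which this sketch does not attempt.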

Availability of the instant nucleotide and deduced amino acid sequences facilitates
immunological screening of cDNA expression libraries. Synthetic peptides representing
portions of the instant amino acid sequences may be synthesized. These peptides can be
used to immunize animals to produce polyclonal or monoclonal antibodies with specificity
for peptides or proteins comprising the amino acid sequences. These antibodies can then
be used to screen cDNA expression libraries to isolate full-length cDNA clones of interest
(Lerner, R. A. (1984) Adv. Immunol. 36:1; Maniatis).

The nucleic acid fragments of the instant invention may be used to create transgenic
plants in which the disclosed CDC-16, DP-1, DP-2 or E2F proteins are present at higher or
lower levels than normal or in cell types or developmental stages in which they are not
normally found. This would have the effect of altering cell cycle regulation in those cells.

Overexpression of the CDC-16, DP-1, DP-2 or E2F proteins of the instant invention
may be accomplished by first constructing a chimeric gene in which the coding region is
operably linked to a promoter capable of directing expression of a gene in the desired tissues
at the desired stage of development. For reasons of convenience, the chimeric gene may
comprise promoter sequences and translation leader sequences derived from the same genes.
3' Non-coding sequences encoding transcription termination signals may also be provided.
The instant chimeric gene may also comprise one or more introns in order to facilitate gene
expression.

Plasmid vectors comprising the instant chimeric gene can then be constructed. The
choice of plasmid vector is dependent upon the method that will be used to transform host
plants. The skilled artisan is well aware of the genetic elements that must be present on the
plasmid vector in order to successfully transform, select and propagate host cells containing
the chimeric gene. The skilled artisan will also recognize that different independent
transformation events will result in different levels and patterns of expression (Jones et al.,
(1985) EMBO J. 4:2411-2418; De Almeida et al., (1989) Mol. Gen. Genetics 218:78-86),
and thus that multiple events must be screened in order to obtain lines displaying the desired
expression level and pattern. Such screening may be accomplished by Southern analysis of
DNA, Northern analysis of mRNA expression, Western analysis of protein expression, or
phenotypic analysis.

It may also be desirable to reduce or eliminate expression of genes encoding CDC-16,
DP-1, DP-2 or E2F proteins in plants for some applications. In order to accomplish this, a
chimeric gene designed for co-suppression of the instant cell cycle regulatory proteins can be
constructed by linking a gene or gene fragment encoding a CDC-16, DP-1, DP-2 or E2F
protein to plant promoter sequences. Alternatively, a chimeric gene designed to express
antisense RNA for all or part of the instant nucleic acid fragment can be constructed by
linking the gene or gene fragment in reverse orientation to plant promoter sequences. Either
the co-suppression or antisense chimeric genes could be introduced into plants via
transformation wherein expression of the corresponding endogenous genes is reduced or
eliminated.

The instant CDC-16, DP-1, DP-2 or E2F proteins (or portions thereof) may be
produced in heterologous host cells, particularly in the cells of microbial hosts, and can be
used to prepare antibodies to these proteins by methods well known to those skilled in
the art. The antibodies are useful for detecting CDC-16, DP-1, DP-2 or E2F proteins in situ
in cells or in vitro in cell extracts. Preferred heterologous host cells for production of the
instant CDC-16, DP-1, DP-2 or E2F proteins are microbial hosts. Microbial expression
systems and expression vectors containing regulatory sequences that direct high level
expression of foreign proteins are well known to those skilled in the art. Any of these could
be used to construct a chimeric gene for production of the instant CDC-16, DP-1, DP-2 or
E2F proteins. This chimeric gene could then be introduced into appropriate microorganisms
via transformation to provide high level expression of the encoded cell cycle regulatory
protein. An example of a vector for high level expression of the instant CDC-16, DP-1,
DP-2 or E2F proteins in a bacterial host is provided (Example 9).

Additionally, the instant CDC-16, DP-1, DP-2 or E2F proteins can be used as targets
to facilitate design and/or identification of inhibitors of those proteins that may be useful as
herbicides. This is desirable because the CDC-16, DP-1, DP-2 or E2F proteins described
herein are involved in the regulation of cell cycle. Accordingly, inhibition of the activity of
one or more of the proteins described herein could lead to inhibition of plant growth. Thus, the
instant CDC-16, DP-1, DP-2 or E2F proteins could be appropriate for new herbicide
discovery and design.

All or a substantial portion of the nucleic acid fragments of the instant invention may
also be used as probes for genetically and physically mapping the genes that they are a part
of, and as markers for traits linked to those genes. Such information may be useful in plant
breeding in order to develop lines with desired phenotypes. For example, the instant nucleic
acid fragments may be used as restriction fragment length polymorphism (RFLP) markers.
Southern blots (Maniatis) of restriction-digested plant genomic DNA may be probed with
the nucleic acid fragments of the instant invention. The resulting banding patterns may then
be subjected to genetic analyses using computer programs such as MapMaker (Lander et al.,
(1987) Genomics 1:174-181) in order to construct a genetic map. In addition, the nucleic
acid fragments of the instant invention may be used to probe Southern blots containing
restriction endonuclease-treated genomic DNAs of a set of individuals representing parent
and progeny of a defined genetic cross. Segregation of the DNA polymorphisms is noted
and used to calculate the position of the instant nucleic acid sequence in the genetic map
previously obtained using this population (Botstein, D. et al., (1980) Am. J. Hum. Genet.
32:314-331).
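The step from segregation data to a map position can be illustrated with a simple two-point calculation. The sketch below assumes a backcross in which recombinant and parental progeny can be counted directly for a marker pair, and converts the recombination fraction to centimorgans with the standard Haldane and Kosambi map functions. The function names and the progeny counts are hypothetical. Programs such as MapMaker use multipoint maximum-likelihood methods, which this sketch does not reproduce.

```python
import math


def recombination_fraction(recombinants: int, total: int) -> float:
    """Fraction of scored progeny that are recombinant for a marker pair."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("counts must satisfy 0 <= recombinants <= total, total > 0")
    return recombinants / total


def haldane_cm(r: float) -> float:
    """Haldane map function (assumes no crossover interference), in cM."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r: float) -> float:
    """Kosambi map function (allows for some interference), in cM."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# Invented example: 12 recombinants among 100 scored backcross progeny.
r = recombination_fraction(12, 100)
print(f"r = {r:.2f}, Haldane = {haldane_cm(r):.1f} cM, Kosambi = {kosambi_cm(r):.1f} cM")
```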

The production and use of plant gene-derived probes for use in genetic mapping is
described in Bernatzky, R. and Tanksley, S. D. (1986) Plant Mol. Biol. Reporter
4(1):37-41. Numerous publications describe genetic mapping of specific cDNA clones
using the methodology outlined above or variations thereof. For example, F2 intercross
populations, backcross populations, randomly mated populations, near isogenic lines, and
other sets of individuals may be used for mapping. Such methodologies are well known to
those skilled in the art.

Nucleic acid probes derived from the instant nucleic acid sequences may also be used
for physical mapping (i.e., placement of sequences on physical maps; see Hoheisel, J. D., et
al., In: Nonmammalian Genomic Analysis: A Practical Guide, Academic Press 1996,
pp. 319-346, and references cited therein).

In another embodiment, nucleic acid probes derived from the instant nucleic acid
sequences may be used in direct fluorescence in situ hybridization (FISH) mapping (Trask,
B. J. (1991) Trends Genet. 7:149-154). Although current methods of FISH mapping favor
use of large clones (several to several hundred kb; see Laan, M. et al. (1995) Genome
Research 5:13-20), improvements in sensitivity may allow performance of FISH mapping
using shorter probes.

A variety of nucleic acid amplification-based methods of genetic and physical
mapping may be carried out using the instant nucleic acid sequences. Examples include
allele-specific amplification (Kazazian, H. H. (1989) J. Lab. Clin. Med. 114(2):95-96),
polymorphism of PCR-amplified fragments (CAPS; Sheffield, V. C. et al. (1993) Genomics
16:325-332), allele-specific ligation (Landegren, U. et al. (1988) Science 241:1077-1080),
nucleotide extension reactions (Sokolov, B. P. (1990) Nucleic Acid Res. 18:3671), Radiation
Hybrid Mapping (Walter, M. A. et al. (1997) Nature Genetics 7:22-28) and Happy Mapping
(Dear, P. H. and Cook, P. R. (1989) Nucleic Acid Res. 17:6795-6807). For these methods,
the sequence of a nucleic acid fragment is used to design and produce primer pairs for use in
the amplification reaction or in primer extension reactions. The design of such primers is
well known to those skilled in the art. In methods employing PCR-based genetic mapping,
it may be necessary to identify DNA sequence differences between the parents of the
mapping cross in the region corresponding to the instant nucleic acid sequence. This,
however, is generally not necessary for mapping methods.

Loss of function mutant phenotypes may be identified for the instant cDNA clones
either by targeted gene disruption protocols or by identifying specific mutants for these
genes contained in a maize population carrying mutations in all possible genes (Ballinger
and Benzer, (1989) Proc. Natl. Acad. Sci. USA 86:9402; Koes et al., (1995) Proc. Natl. Acad.
Sci. USA 92:8149; Bensen et al., (1995) Plant Cell 7:75). The latter approach may be
accomplished in two ways. First, short segments of the instant nucleic acid fragments may
be used in polymerase chain reaction protocols in conjunction with a mutation tag sequence
primer on DNAs prepared from a population of plants in which Mutator transposons or some
other mutation-causing DNA element has been introduced (see Bensen, supra). The
amplification of a specific DNA fragment with these primers indicates the insertion of the
mutation tag element in or near the plant gene encoding the CDC-16, DP-1, DP-2 or E2F
protein. Alternatively, the instant nucleic acid fragment may be used as a hybridization
probe against PCR amplification products generated from the mutation population using the
mutation tag sequence primer in conjunction with an arbitrary genomic site primer, such as
that for a restriction enzyme site-anchored synthetic adaptor. With either method, a plant
containing a mutation in the endogenous gene encoding a CDC-16, DP-1, DP-2 or E2F
protein can be identified and obtained. This mutant plant can then be used to determine or
confirm the natural function of the CDC-16, DP-1, DP-2 or E2F protein gene product.

EXAMPLES 

The present invention is further defined in the following Examples, in which all parts
and percentages are by weight and degrees are Celsius, unless otherwise stated. It should be
understood that these Examples, while indicating preferred embodiments of the invention,
are given by way of illustration only. From the above discussion and these Examples, one
skilled in the art can ascertain the essential characteristics of this invention, and without
departing from the spirit and scope thereof, can make various changes and modifications of
the invention to adapt it to various usages and conditions.

EXAMPLE 1 

Composition of cDNA Libraries; Isolation and Sequencing of cDNA Clones
cDNA libraries representing mRNAs from various corn, Impatiens, rice, soybean and
wheat tissues were prepared. The characteristics of the libraries are described below.



                                   TABLE 2

          cDNA Libraries from Corn, Impatiens, Rice, Soybean and Wheat

Library   Tissue                                                    Clone

cdelc     Corn developing embryo 20 days after pollination          cdelc.pk001.j13
cen3n     Corn endosperm days after pollination*                    cen3n.pk0183.b9
cpflc     Corn tissue treated with chemicals related to protein     cpflc.pk001.k13
          synthesis and pooled***
ids       Impatiens balsamina developing seed                       ids.pk0025.f7
p0005     Corn immature ear                                         p0005.cbmfh22r
p0072     Corn 14 days after planting etiolated seedling:           p0072.comfs64r
          mesocotyl
rslln     Rice 15 day old seedling*                                 rslln.pk004.d15
sel       Soybean embryo, 6-10 days after flowering                 sel.pk0012.f4
sfll      Soybean immature flower                                   sfll.pk0030.e3
src3c     Soybean 8 day old root inoculated with eggs of cyst       src3c.pk018.m11
          nematode Heterodera glycines (Race 14) for 4 days
wlmkl     Wheat seedlings 1 hr after inoculation with Erysiphe      wlmkl.pk0005.e2
          graminis f. sp. tritici and treatment with fungicide**



*These libraries were normalized essentially as described in U.S. Patent No. 5,482,845.

**Application of 6-iodo-2-propoxy-3-propyl-4(3H)-quinazolinone; synthesis and methods
of using this compound are described in USSN 08/545,827, incorporated herein by
reference.

***Chloramphenicol, cycloheximide, neomycin sulfate and aurintricarboxylic acid,
commercially available from Sigma Chemical Co. and Calbiochem-Novabiochem Corp.

cDNA libraries were prepared in Uni-ZAP™ XR vectors according to the
manufacturer's protocol (Stratagene Cloning Systems, La Jolla, CA). Conversion of the
Uni-ZAP™ XR libraries into plasmid libraries was accomplished according to the protocol
provided by Stratagene. Upon conversion, cDNA inserts were contained in the plasmid
vector pBluescript. cDNA inserts from randomly picked bacterial colonies containing
recombinant pBluescript plasmids were amplified via polymerase chain reaction using
primers specific for vector sequences flanking the inserted cDNA sequences, or plasmid
DNA was prepared from cultured bacterial cells. Amplified insert DNAs or plasmid DNAs
were sequenced in dye-primer sequencing reactions to generate partial cDNA sequences
(expressed sequence tags or "ESTs"; see Adams, M. D. et al., (1991) Science 252:1651).
The resulting ESTs were analyzed using a Perkin Elmer Model 377 fluorescent sequencer.

EXAMPLE 2 
Identification of cDNA Clones 
ESTs encoding cell cycle regulatory proteins were identified by conducting BLAST
(Basic Local Alignment Search Tool; Altschul, S. F., et al., (1993) J. Mol. Biol.
215:403-410; see also www.ncbi.nlm.nih.gov/BLAST/) searches for similarity to
sequences contained in the BLAST "nr" database (comprising all non-redundant GenBank
CDS translations, sequences derived from the 3-dimensional structure Brookhaven Protein
Data Bank, the last major release of the SWISS-PROT protein sequence database, EMBL,
and DDBJ databases). The cDNA sequences obtained in Example 1 were analyzed for
similarity to all publicly available DNA sequences contained in the "nr" database using the
BLASTN algorithm provided by the National Center for Biotechnology Information
(NCBI). The DNA sequences were translated in all reading frames and compared for
similarity to all publicly available protein sequences contained in the "nr" database using
the BLASTX algorithm (Gish, W. and States, D. J. (1993) Nature Genetics 3:266-272 and
Altschul, Stephen F., et al. (1997) Nucleic Acids Res. 25:3389-3402) provided by the NCBI.
For convenience, the P-value (probability) of observing a match of a cDNA sequence to a
sequence contained in the searched databases merely by chance, as calculated by BLAST,
is reported herein as a "pLog" value, which represents the negative of the logarithm of the
reported P-value. Accordingly, the greater the pLog value, the greater the likelihood that
the cDNA sequence and the BLAST "hit" represent homologous proteins.
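The pLog conversion can be written out as a short sketch. The text specifies only "the negative of the logarithm"; base 10 is assumed here, and the function names are hypothetical.

```python
import math


def plog(p_value: float) -> float:
    """Negative base-10 logarithm of a BLAST P-value (base 10 is an assumption)."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("P-value must be in (0, 1]")
    return -math.log10(p_value)


def p_value_from_plog(plog_value: float) -> float:
    """Inverse conversion: recover the P-value from a pLog value."""
    return 10.0 ** (-plog_value)


# A smaller chance probability gives a larger pLog.
for p in (1e-5, 1e-17, 1e-45):
    print(f"P = {p:.0e}  ->  pLog = {plog(p):.2f}")
```

Under this assumption a pLog of 5.00 corresponds to a chance probability of about 1 in 100,000.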

EXAMPLE 3 

Characterization of cDNA Clones Encoding CDC-16 Proteins
The BLASTX search using the EST sequences from clones cpflc.pk001.k13 and
sfll.pk0030.e3 revealed similarity of the proteins encoded by the cDNAs to CDC-16 protein
from Homo sapiens (NCBI Identifier No. gi 1362769). The BLAST results for each of these
ESTs are shown in Table 3:

                              TABLE 3

         BLAST Results for Clones Encoding Polypeptides Homologous
                     to Homo sapiens CDC-16 Protein

Clone                        BLAST pLog Score

cpflc.pk001.k13              11.00
sfll.pk0030.e3               50.40

The sequence of a portion of the cDNA insert from clone cpflc.pk001.k13 is shown in
SEQ ID NO:1; the deduced amino acid sequence of this cDNA, which represents 14% of the
N-terminal region of the protein, is shown in SEQ ID NO:2. A calculation of the percent
similarity of the amino acid sequence set forth in SEQ ID NO:2 and the Homo sapiens
sequence (using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:2 is
40% similar to the Homo sapiens CDC-16 protein.
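To make the kind of figure reported above concrete, the following sketch computes a percent identity from two sequences that have already been aligned, with gaps written as "-". This is an illustrative stand-in and an assumption: the Clustal program derives its own similarity values, and its scoring and denominator may differ from this simple count. The aligned strings are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free aligned columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if res_a.upper() == res_b.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Invented example alignment: 4 identical residues in 6 gap-free columns.
print(f"{percent_identity('MKT-LLV', 'MRTALLI'):.1f}%")
```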

The sequence of a portion of the cDNA insert from clone sfll.pk0030.e3 is shown in
SEQ ID NO:3; the deduced amino acid sequence of this cDNA, which represents 43% of the
protein (middle region), is shown in SEQ ID NO:4. A calculation of the percent similarity
of the amino acid sequence set forth in SEQ ID NO:4 and the Homo sapiens sequence (using
the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:4 is 46% similar to
the Homo sapiens CDC-16 protein. The percent similarity between the corn and soybean
amino acid sequences was calculated to be 18% using the Clustal algorithm.

BLAST scores and probabilities indicate that the instant nucleic acid fragments encode
portions of CDC-16 proteins. These sequences represent the first plant sequences encoding
CDC-16 proteins.

EXAMPLE 4 

Characterization of cDNA Clones Encoding DP-1 Protein
The BLASTX search using the EST sequence from clone ids.pk0025.f7 revealed
similarity of the protein encoded by the cDNA to DP-1 protein from Xenopus laevis (NCBI
Identifier No. gi 913227). The BLASTX search using the EST sequences from clones
p0072.comfs64r and src3c.pk018.m11 revealed similarity of the proteins encoded by the
cDNAs to DP-1 protein from Mus musculus (NCBI Identifier No. gi 420232).
The BLAST results for each of these ESTs are shown in Table 4:



                              TABLE 4

         BLAST Results for Clones Encoding Polypeptides Homologous
               to Xenopus laevis and Mus musculus DP-1 Protein

Clone                        BLAST pLog Score

ids.pk0025.f7                23.30
p0072.comfs64r                5.00
src3c.pk018.m11              17.00



The sequence of a portion of the cDNA insert from clone ids.pk0025.f7 is shown in
SEQ ID NO:5; the deduced amino acid sequence of this cDNA, which represents 33% of the
protein (middle region), is shown in SEQ ID NO:6. A calculation of the percent
similarity of the amino acid sequence set forth in SEQ ID NO:6 and the Xenopus laevis
sequence (using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:6 is
46% similar to the Xenopus laevis DP-1 protein.

The sequence of a portion of the cDNA insert from clone p0072.comfs64r is shown in
SEQ ID NO:7; the deduced amino acid sequence of this cDNA, which represents 23% of the
protein (middle region), is shown in SEQ ID NO:8. A calculation of the percent similarity
of the amino acid sequence set forth in SEQ ID NO:8 and the Mus musculus sequence (using
the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:8 is 37% similar to
the Mus musculus DP-1 protein.

The sequence of a portion of the cDNA insert from clone src3c.pk018.m11 is shown in
SEQ ID NO:9; the deduced amino acid sequence of this cDNA, which represents 42% of the
protein (middle region), is shown in SEQ ID NO:10. A calculation of the percent similarity
of the amino acid sequence set forth in SEQ ID NO:10 and the Mus musculus sequence
(using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:10 is 48%
similar to the Mus musculus DP-1 protein.

The percent similarity between the corn, Impatiens and soybean amino acid sequences
was calculated to range from 31% to 78% using the Clustal algorithm.

BLAST scores and probabilities indicate that the instant nucleic acid fragments encode 
portions of DP-1 proteins. These sequences represent the first plant sequences encoding 
DP-1 proteins. 

EXAMPLE 5 

Characterization of cDNA Clones Encoding DP-2 Protein 
The BLASTX search using the EST sequences from clones p0005.cbmfh22r,
cdelc.pk001.j13 and cen3n.pk0183.b9 revealed similarity of the proteins encoded by the
cDNAs to DP-2 protein from Homo sapiens (NCBI Identifier No. gi 604479). The
BLASTX search using the EST sequence from clone wlmkl.pk0005.e2 revealed similarity
of the protein encoded by the cDNA to DP-2 protein from Mus species (NCBI Identifier
No. gi 3122929).

In the process of comparing the ESTs it was found that corn clones p0005.cbmfh22r,
cdelc.pk001.j13 and cen3n.pk0183.b9 had overlapping regions of homology. Using this
homology it was possible to align the ESTs and assemble a contig encoding a unique corn
DP-2 protein.
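The assembly step can be illustrated with a minimal sketch that joins two reads through an exact suffix-prefix overlap. This is an assumption about the general technique, not a description of the software actually used; real assemblers tolerate mismatches and consider both strands. The function names and sequences are hypothetical.

```python
def longest_overlap(left: str, right: str, min_overlap: int = 10) -> int:
    """Length of the longest suffix of `left` that exactly equals a prefix of `right`."""
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def merge_reads(left: str, right: str, min_overlap: int = 10) -> str:
    """Join two reads into one contig through their overlapping region."""
    k = longest_overlap(left, right, min_overlap)
    if k == 0:
        raise ValueError("reads do not overlap by the required length")
    return left + right[k:]


# Invented example reads sharing the 8-base overlap ACGTTAGC.
est_a = "ATGGCGTACGTTAGC"
est_b = "ACGTTAGCCGATAAG"
contig = merge_reads(est_a, est_b, min_overlap=6)
print(contig)
```

Three or more ESTs could be joined by applying the same merge repeatedly in order along the transcript.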

The BLAST results for the corn contig and the wheat EST are shown in Table 5:





                              TABLE 5

         BLAST Results for Clones Encoding Polypeptides Homologous
                to Homo sapiens and Mus species DP-2 Protein

Clone                        BLAST pLog Score

Contig composed of:          45.10
  p0005.cbmfh22r
  cdelc.pk001.j13
  cen3n.pk0183.b9
wlmkl.pk0005.e2               7.00



The sequence of the corn contig composed of clones p0005.cbmfh22r,
cdelc.pk001.j13 and cen3n.pk0183.b9 is shown in SEQ ID NO:11; the deduced amino acid
sequence of this contig, which represents 50% of the protein (middle region), is shown in
SEQ ID NO:12. A calculation of the percent similarity of the amino acid sequence set forth
in SEQ ID NO:12 and the Homo sapiens sequence (using the Clustal algorithm) revealed
that the protein encoded by SEQ ID NO:12 is 52% similar to the Homo sapiens DP-2 protein.

The sequence of the entire cDNA insert from clone wlmkl.pk0005.e2 is shown in SEQ
ID NO:13; the deduced amino acid sequence of this cDNA, which represents 25% of
the protein (middle region), is shown in SEQ ID NO:14. A calculation of the percent
similarity of the amino acid sequence set forth in SEQ ID NO:14 and the Mus species
sequence (using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:14
is 44% similar to the Mus species DP-2 protein.

The percent similarity between the corn and wheat amino acid sequences was calculated
to be 86% using the Clustal algorithm.

BLAST scores and probabilities indicate that the instant nucleic acid fragments encode
portions of DP-2 proteins. These sequences represent the first plant sequences encoding
DP-2 proteins.

EXAMPLE 6 

Characterization of cDNA Clones Encoding E2F Proteins 
The BLASTX search using the EST sequence from clone rslln.pk004.d15 revealed
similarity of the protein encoded by the cDNA to E2F-4 protein from Homo sapiens (NCBI
Identifier No. gi 1061146). The BLASTX search using the EST sequence from clone
sel.pk0012.f4 revealed similarity of the protein encoded by the cDNA to E2F protein from
Drosophila melanogaster (NCBI Identifier No. gi 3551069).

The BLAST results for each of the ESTs are shown in Table 6: 

                              TABLE 6

         BLAST Results for Clones Encoding Polypeptides Homologous
            to Homo sapiens and Drosophila melanogaster E2F Proteins

Clone                        BLAST pLog Score

rslln.pk004.d15               5.52
sel.pk0012.f4                12.00

The sequence of a portion of the cDNA insert from clone rslln.pk004.d15 is shown in
SEQ ID NO:15; the deduced amino acid sequence of this cDNA, which represents 12% of
the protein (middle region), is shown in SEQ ID NO:16. A calculation of the percent
similarity of the amino acid sequence set forth in SEQ ID NO:16 and the Homo sapiens
sequence (using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:16
is 49% similar to the Homo sapiens E2F-4 protein.

The sequence of the entire cDNA insert from clone sel.pk0012.f4 is shown in SEQ ID
NO:17; the deduced amino acid sequence of this cDNA, which represents 10% of the
protein (middle region), is shown in SEQ ID NO:18. A calculation of the percent similarity
of the amino acid sequence set forth in SEQ ID NO:18 and the Drosophila melanogaster
sequence (using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:18
is 43% similar to the Drosophila melanogaster E2F protein.

The percent similarity between the rice and soybean amino acid sequences was
calculated to be 15% using the Clustal algorithm.

BLAST scores and probabilities indicate that the instant nucleic acid fragments encode
portions of E2F proteins. These sequences represent the first plant sequences encoding E2F
proteins.

EXAMPLE 7

Expression of Chimeric Genes in Monocot Cells 
A chimeric gene comprising a cDNA encoding a cell cycle regulatory protein in sense
orientation with respect to the maize 27 kD zein promoter that is located 5' to the cDNA
fragment, and the 10 kD zein 3' end that is located 3' to the cDNA fragment, can be
constructed. The cDNA fragment of this gene may be generated by polymerase chain
reaction (PCR) of the cDNA clone using appropriate oligonucleotide primers. Cloning sites
(NcoI or SmaI) can be incorporated into the oligonucleotides to provide proper orientation of
the DNA fragment when inserted into the digested vector pML103 as described below.
Amplification is then performed in a standard PCR. The amplified DNA is then digested
with restriction enzymes NcoI and SmaI and fractionated on an agarose gel. The appropriate
band can be isolated from the gel and combined with a 4.9 kb NcoI-SmaI fragment of the
plasmid pML103. Plasmid pML103 has been deposited under the terms of the Budapest
Treaty at ATCC (American Type Culture Collection, 10801 University Blvd., Manassas, VA
20110-2209), and bears accession number ATCC 97366. The DNA segment from pML103
contains a 1.05 kb SalI-NcoI promoter fragment of the maize 27 kD zein gene and a 0.96 kb
SmaI-SalI fragment from the 3' end of the maize 10 kD zein gene in the vector pGem9Zf(+)
(Promega). Vector and insert DNA can be ligated at 15°C overnight, essentially as
described (Maniatis). The ligated DNA may then be used to transform E. coli XL1-Blue
(Epicurian Coli XL-1 Blue™; Stratagene). Bacterial transformants can be screened by
restriction enzyme digestion of plasmid DNA and limited nucleotide sequence analysis using
the dideoxy chain termination method (Sequenase™ DNA Sequencing Kit; U.S.
Biochemical). The resulting plasmid construct would comprise a chimeric gene encoding, in
the 5' to 3' direction, the maize 27 kD zein promoter, a cDNA fragment encoding a cell cycle
regulatory protein, and the 10 kD zein 3' region.
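Because the amplified fragment is cut with NcoI and SmaI before ligation, it can be useful to check that those sites occur only where the primers introduced them. The sketch below assumes the standard recognition sequences CCATGG (NcoI) and CCCGGG (SmaI); the function name and the example amplicon are hypothetical.

```python
# Recognition sequences for the two cloning enzymes named in the text.
SITES = {"NcoI": "CCATGG", "SmaI": "CCCGGG"}


def find_sites(seq: str, sites: dict = SITES) -> dict:
    """Return the 0-based start positions of each recognition sequence in seq."""
    seq = seq.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)  # allow overlapping matches
        hits[enzyme] = positions
    return hits


# Invented amplicon: NcoI site at the 5' end, SmaI site at the 3' end.
amplicon = "CCATGG" + "CTAGCAAGGTTGACTTACGA" + "CCCGGG"
hits = find_sites(amplicon)
print(hits)

# An insert suited to directional cloning has exactly one site for each enzyme.
single_cutters = all(len(positions) == 1 for positions in hits.values())
print("one site per enzyme:", single_cutters)
```

Both recognition sequences are palindromic, so scanning one strand is sufficient.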

The chimeric gene described above can then be introduced into corn cells by the
following procedure. Immature corn embryos can be dissected from developing caryopses
derived from crosses of the inbred corn lines H99 and LH132. The embryos are isolated 10
to 11 days after pollination when they are 1.0 to 1.5 mm long. The embryos are then placed
with the axis-side facing down and in contact with agarose-solidified N6 medium (Chu
et al., (1975) Sci. Sin. Peking 18:659-668). The embryos are kept in the dark at 27°C.
Friable embryogenic callus consisting of undifferentiated masses of cells with somatic
proembryoids and embryoids borne on suspensor structures proliferates from the scutellum
of these immature embryos. The embryogenic callus isolated from the primary explant can
be cultured on N6 medium and sub-cultured on this medium every 2 to 3 weeks.

The plasmid, p35S/Ac (obtained from Dr. Peter Eckes, Hoechst Ag, Frankfurt,
Germany) may be used in transformation experiments in order to provide for a selectable
marker. This plasmid contains the pat gene (see European Patent Publication 0 242 236)
which encodes phosphinothricin acetyl transferase (PAT). The enzyme PAT confers
resistance to herbicidal glutamine synthetase inhibitors such as phosphinothricin. The pat
gene in p35S/Ac is under the control of the 35S promoter from Cauliflower Mosaic Virus
(Odell et al. (1985) Nature 313:810-812) and the 3' region of the nopaline synthase gene
from the T-DNA of the Ti plasmid of Agrobacterium tumefaciens.

The particle bombardment method (Klein et al., (1987) Nature 327:70-73) may be
used to transfer genes to the callus culture cells. According to this method, gold particles
(1 μm in diameter) are coated with DNA using the following technique. Ten μg of plasmid
DNAs are added to 50 μL of a suspension of gold particles (60 mg per mL). Calcium
chloride (50 μL of a 2.5 M solution) and spermidine free base (20 μL of a 1.0 M solution)
are added to the particles. The suspension is vortexed during the addition of these solutions.
After 10 minutes, the tubes are briefly centrifuged (5 sec at 15,000 rpm) and the supernatant
removed. The particles are resuspended in 200 μL of absolute ethanol, centrifuged again
and the supernatant removed. The ethanol rinse is performed again and the particles
resuspended in a final volume of 30 μL of ethanol. An aliquot (5 μL) of the DNA-coated
gold particles can be placed in the center of a Kapton™ flying disc (Bio-Rad Labs). The
particles are then accelerated into the corn tissue with a Biolistic™ PDS-1000/He (Bio-Rad
Instruments, Hercules CA), using a helium pressure of 1000 psi, a gap distance of 0.5 cm
and a flying distance of 1.0 cm.
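The quantities in the coating procedure imply an approximate load per bombardment, which can be worked out as follows. The sketch assumes that the 10 μg is the total plasmid DNA added, that all of it binds to the gold, and that nothing is lost during the ethanol rinses, so the DNA figure is an upper bound. The function and key names are hypothetical.

```python
def per_shot_quantities(
    dna_ug: float = 10.0,            # total plasmid DNA added (assumed total)
    gold_suspension_ul: float = 50.0,
    gold_mg_per_ml: float = 60.0,
    final_volume_ul: float = 30.0,   # ethanol volume after the final rinse
    aliquot_ul: float = 5.0,         # volume placed on each flying disc
) -> dict:
    """Estimate gold and DNA carried by one aliquot, assuming no losses."""
    gold_mg_total = gold_suspension_ul / 1000.0 * gold_mg_per_ml
    fraction_per_shot = aliquot_ul / final_volume_ul
    return {
        "gold_mg_total": gold_mg_total,
        "gold_mg_per_shot": gold_mg_total * fraction_per_shot,
        "dna_ug_per_shot_max": dna_ug * fraction_per_shot,
        "shots_per_prep": final_volume_ul / aliquot_ul,
    }


quantities = per_shot_quantities()
for name, value in quantities.items():
    print(f"{name}: {value:.2f}")
```

With the stated volumes, each 5 μL aliquot is one sixth of the preparation, so a single preparation supplies about six bombardments.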

For bombardment, the embryogenic tissue is placed on filter paper over agarose-
solidified N6 medium. The tissue is arranged as a thin lawn and covers a circular area of
about 5 cm in diameter. The petri dish containing the tissue can be placed in the chamber of
the PDS-1000/He approximately 8 cm from the stopping screen. The air in the chamber is
then evacuated to a vacuum of 28 inches of Hg. The macrocarrier is accelerated with a
helium shock wave using a rupture membrane that bursts when the He pressure in the shock
tube reaches 1000 psi.

Seven days after bombardment the tissue can be transferred to N6 medium that
contains glufosinate (2 mg per liter) and lacks casein or proline. The tissue continues to
grow slowly on this medium. After an additional 2 weeks the tissue can be transferred to
fresh N6 medium containing glufosinate. After 6 weeks, areas of about 1 cm in diameter
of actively growing callus can be identified on some of the plates containing the glufosinate-
supplemented medium. These calli may continue to grow when sub-cultured on the
selective medium.

Plants can be regenerated from the transgenic callus by first transferring clusters of
tissue to N6 medium supplemented with 0.2 mg per liter of 2,4-D. After two weeks the
tissue can be transferred to regeneration medium (Fromm et al. (1990) Bio/Technology
8:833-839).

EXAMPLE 8 
Expression of Chimeric Genes in Dicot Cells 
A seed-specific expression cassette composed of the promoter and transcription
terminator from the gene encoding the β subunit of the seed storage protein phaseolin from
the bean Phaseolus vulgaris (Doyle et al. (1986) J. Biol. Chem. 261:9228-9238) can be used
for expression of the instant cell cycle regulatory protein in transformed soybean. The
phaseolin cassette includes about 500 nucleotides upstream (5') from the translation initiation
codon and about 1650 nucleotides downstream (3') from the translation stop codon of
phaseolin. Between the 5' and 3' regions are the unique restriction endonuclease sites Nco I
(which includes the ATG translation initiation codon), Sma I, Kpn I and Xba I. The entire
cassette is flanked by Hind III sites.

The cDNA fragment of this gene may be generated by polymerase chain reaction
(PCR) of the cDNA clone using appropriate oligonucleotide primers. Cloning sites can be
incorporated into the oligonucleotides to provide proper orientation of the DNA fragment
when inserted into the expression vector. Amplification is then performed as described
above, and the isolated fragment is inserted into a pUC18 vector carrying the seed
expression cassette.

Soybean embryos may then be transformed with the expression vector comprising a
sequence encoding a cell cycle regulatory protein. To induce somatic embryos, cotyledons,
3-5 mm in length, dissected from surface-sterilized, immature seeds of the soybean cultivar
A2872, can be cultured in the light or dark at 26°C on an appropriate agar medium for
6-10 weeks. Somatic embryos which produce secondary embryos are then excised and
placed into a suitable liquid medium. After repeated selection for clusters of somatic
embryos which multiply as early, globular-staged embryos, the suspensions are maintained
as described below.

Soybean embryogenic suspension cultures can be maintained in 35 mL liquid media on a
rotary shaker, 150 rpm, at 26°C with fluorescent lights on a 16:8 hour day/night schedule.
Cultures are subcultured every two weeks by inoculating approximately 35 mg of tissue into
35 mL of liquid medium.

Soybean embryogenic suspension cultures may then be transformed by the method of
particle gun bombardment (Klein et al. (1987) Nature (London) 327:70, U.S. Patent
No. 4,945,050). A DuPont Biolistic™ PDS1000/HE instrument (helium retrofit) can be used
for these transformations.

A selectable marker gene which can be used to facilitate soybean transformation is a
chimeric gene composed of the 35S promoter from Cauliflower Mosaic Virus (Odell et al.
(1985) Nature 313:810-812), the hygromycin phosphotransferase gene from plasmid pJR225
(from E. coli; Gritz et al. (1983) Gene 25:179-188) and the 3' region of the nopaline synthase
gene from the T-DNA of the Ti plasmid of Agrobacterium tumefaciens. The seed expression
cassette comprising the phaseolin 5' region, the fragment encoding the cell cycle regulatory
protein and the phaseolin 3' region can be isolated as a restriction fragment. This fragment
can then be inserted into a unique restriction site of the vector carrying the marker gene.
To 50 µL of a 60 mg/mL 1 µm gold particle suspension is added (in order): 5 µL
DNA (1 µg/µL), 20 µL spermidine (0.1 M), and 50 µL CaCl2 (2.5 M). The particle
preparation is then agitated for three minutes, spun in a microfuge for 10 seconds and the
supernatant removed. The DNA-coated particles are then washed once in 400 µL 70%
ethanol and resuspended in 40 µL of anhydrous ethanol. The DNA/particle suspension can
be sonicated three times for one second each. Five µL of the DNA-coated gold particles are
then loaded on each macro carrier disk.
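
The coating recipe above can be checked with a short dilution calculation. The following Python sketch is illustrative only: it assumes the gold suspension volume is 50 µL (the unit is not legible in the source text), and the per-shot DNA figure assumes that all DNA binds to the particles and that nothing is lost in the washes, so it is an upper bound rather than a measured value.

```python
# Stock concentrations and volumes as read from the protocol above.
# The 50 uL gold volume is an assumption: the unit is missing in the source text.
COMPONENTS = {
    "gold (mg/mL)": (60.0, 50.0),    # (stock concentration, volume in uL)
    "DNA (ug/uL)": (1.0, 5.0),
    "spermidine (M)": (0.1, 20.0),
    "CaCl2 (M)": (2.5, 50.0),
}


def final_concentrations(components):
    """Return (total volume in uL, {name: concentration after mixing})."""
    total_ul = sum(volume for _, volume in components.values())
    mixed = {name: stock * volume / total_ul
             for name, (stock, volume) in components.items()}
    return total_ul, mixed


total_ul, mixed = final_concentrations(COMPONENTS)
print(f"total volume: {total_ul} uL")
for name, value in mixed.items():
    print(f"{name}: {value:.3g}")

# Hypothetical per-shot estimate: 5 ug DNA in total, 5 uL loaded per
# macrocarrier out of the 40 uL final ethanol suspension, no losses assumed.
dna_per_shot_ug = 5.0 * (5.0 / 40.0)
print(f"DNA per macrocarrier (upper bound): {dna_per_shot_ug} ug")
```

Under these assumptions the mixture is 125 µL in total, with calcium chloride at 1.0 M and spermidine at 16 mM during precipitation.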

Approximately 300-400 mg of a two-week-old suspension culture is placed in an
empty 60x15 mm petri dish and the residual liquid removed from the tissue with a pipette.
For each transformation experiment, approximately 5-10 plates of tissue are normally
bombarded. Membrane rupture pressure is set at 1100 psi and the chamber is evacuated to a
vacuum of 28 inches mercury. The tissue is placed approximately 3.5 inches away from the
retaining screen and bombarded three times. Following bombardment, the tissue can be
divided in half and placed back into liquid and cultured as described above.

Five to seven days post bombardment, the liquid media may be exchanged with fresh
media, and eleven to twelve days post bombardment with fresh media containing 50 mg/mL
hygromycin. This selective media can be refreshed weekly. Seven to eight weeks post
bombardment, green, transformed tissue may be observed growing from untransformed,
necrotic embryogenic clusters. Isolated green tissue is removed and inoculated into
individual flasks to generate new, clonally propagated, transformed embryogenic suspension
cultures. Each new line may be treated as an independent transformation event. These
suspensions can then be subcultured and maintained as clusters of immature embryos or
regenerated into whole plants by maturation and germination of individual somatic embryos.

EXAMPLE 9 
Expression of Chimeric Genes in Microbial Cells 
The cDNAs encoding the instant cell cycle regulatory proteins can be inserted into the
T7 E. coli expression vector pBT430. This vector is a derivative of pET-3a (Rosenberg et al.
(1987) Gene 56:125-135) which employs the bacteriophage T7 RNA polymerase/T7
promoter system. Plasmid pBT430 was constructed by first destroying the EcoR I and
Hind III sites in pET-3a at their original positions. An oligonucleotide adaptor containing
EcoR I and Hind III sites was inserted at the BamH I site of pET-3a. This created pET-3aM
with additional unique cloning sites for insertion of genes into the expression vector. Then,
the Nde I site at the position of translation initiation was converted to an Nco I site using
oligonucleotide-directed mutagenesis. The DNA sequence of pET-3aM in this region,
5'-CATATGG, was converted to 5'-CCCATGG in pBT430.
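
The effect of this mutagenesis on the two restriction sites can be confirmed by simple string matching. The sketch below is an illustration, not part of the described method: the recognition sequences used are the standard ones for Nde I (CATATG) and Nco I (CCATGG), and the helper name `has_site` is hypothetical.

```python
NDE_I = "CATATG"  # Nde I recognition sequence
NCO_I = "CCATGG"  # Nco I recognition sequence


def has_site(sequence, site):
    """Return True if the recognition site occurs in the given strand."""
    return site in sequence.upper()


pet3am_region = "CATATGG"  # region around the start codon in pET-3aM
pbt430_region = "CCCATGG"  # the same region after mutagenesis in pBT430

for name, region in (("pET-3aM", pet3am_region), ("pBT430", pbt430_region)):
    print(name,
          "Nde I:", has_site(region, NDE_I),
          "Nco I:", has_site(region, NCO_I),
          "ATG retained:", "ATG" in region)
```

Both versions of the region keep the ATG initiation codon; only the flanking bases change, which is what swaps one cloning site for the other.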
Plasmid DNA containing a cDNA may be appropriately digested to release a nucleic
acid fragment encoding the protein. This fragment may then be purified on a 1% NuSieve
GTG™ low melting agarose gel (FMC). Buffer and agarose contain 10 µg/ml ethidium
bromide for visualization of the DNA fragment. The fragment can then be purified from the
agarose gel by digestion with GELase™ (Epicentre Technologies) according to the
manufacturer's instructions, ethanol precipitated, dried and resuspended in 20 µL of water.
Appropriate oligonucleotide adapters may be ligated to the fragment using T4 DNA ligase
(New England Biolabs, Beverly, MA). The fragment containing the ligated adapters can be
purified from the excess adapters using low melting agarose as described above. The vector
pBT430 is digested, dephosphorylated with alkaline phosphatase (NEB) and deproteinized
with phenol/chloroform as described above. The prepared vector pBT430 and fragment can
then be ligated at 16°C for 15 hours followed by transformation into DH5 electrocompetent
cells (GIBCO BRL). Transformants can be selected on agar plates containing LB media and
100 µg/mL ampicillin. Transformants containing the gene encoding the cell cycle regulatory
protein are then screened for the correct orientation with respect to the T7 promoter by
restriction enzyme analysis.

For high level expression, a plasmid clone with the cDNA insert in the correct
orientation relative to the T7 promoter can be transformed into E. coli strain BL21(DE3)
(Studier et al. (1986) J. Mol. Biol. 189:113-130). Cultures are grown in LB medium
containing ampicillin (100 mg/L) at 25°C. At an optical density at 600 nm of approximately
1, IPTG (isopropylthio-β-galactoside, the inducer) can be added to a final concentration of
0.4 mM and incubation can be continued for 3 h at 25°C. Cells are then harvested by
centrifugation and re-suspended in 50 µL of 50 mM Tris-HCl at pH 8.0 containing 0.1 mM
DTT and 0.2 mM phenyl methylsulfonyl fluoride. A small amount of 1 mm glass beads can
be added and the mixture sonicated 3 times for about 5 seconds each time with a microprobe
sonicator. The mixture is centrifuged and the protein concentration of the supernatant
determined. One µg of protein from the soluble fraction of the culture can be separated by
SDS-polyacrylamide gel electrophoresis. Gels can be observed for protein bands migrating
at the expected molecular weight.

EXAMPLE 10
Evaluating Compounds for Their Ability to Inhibit the Activity
of Cell Cycle Regulatory Proteins
The cell cycle regulatory proteins described herein may be produced using any number
of methods known to those skilled in the art. Such methods include, but are not limited to,
expression in bacteria as described in Example 9, or expression in eukaryotic cell culture,
in planta, and using viral expression systems in suitably infected organisms or cell lines.
The instant cell cycle regulatory proteins may be expressed either as mature forms of the
proteins as observed in vivo or as fusion proteins by covalent attachment to a variety of
enzymes, proteins or affinity tags. Common fusion protein partners include glutathione
S-transferase ("GST"), thioredoxin ("Trx"), maltose binding protein, and C- and/or
N-terminal hexahistidine polypeptide ("(His)6"). The fusion proteins may be engineered
with a protease recognition site at the fusion point so that fusion partners can be separated by
protease digestion to yield intact mature enzyme. Examples of such proteases include
thrombin, enterokinase and factor Xa. However, any protease can be used which specifically
cleaves the peptide connecting the fusion protein and the enzyme.

Purification of the instant cell cycle regulatory proteins, if desired, may utilize any
number of separation technologies familiar to those skilled in the art of protein purification.
Examples of such methods include, but are not limited to, homogenization, filtration,
centrifugation, heat denaturation, ammonium sulfate precipitation, desalting, pH
precipitation, ion exchange chromatography, hydrophobic interaction chromatography and
affinity chromatography, wherein the affinity ligand represents a substrate, substrate analog
or inhibitor. When the cell cycle regulatory proteins are expressed as fusion proteins, the
purification protocol may include the use of an affinity resin which is specific for the fusion
protein tag attached to the expressed enzyme or an affinity resin containing ligands which are
specific for the enzyme. For example, a cell cycle regulatory protein may be expressed as a
fusion protein coupled to the C-terminus of thioredoxin. In addition, a (His)6 peptide may be
engineered into the N-terminus of the fused thioredoxin moiety to afford additional
opportunities for affinity purification. Other suitable affinity resins could be synthesized by
linking the appropriate ligands to any suitable resin such as Sepharose-4B. In an alternate
embodiment, a thioredoxin fusion protein may be eluted using dithiothreitol; however,
elution may be accomplished using other reagents which interact to displace the thioredoxin
from the resin. These reagents include β-mercaptoethanol or other reduced thiol. The
eluted fusion protein may be subjected to further purification by traditional means as stated
above, if desired. Proteolytic cleavage of the thioredoxin fusion protein and the enzyme may
be accomplished after the fusion protein is purified or while the protein is still bound to the
ThioBond™ affinity resin or other resin.

Crude, partially purified or purified protein, either alone or as a fusion protein, may be
utilized in assays for the evaluation of compounds for their ability to inhibit enzymatic
activation of the cell cycle regulatory proteins disclosed herein. Assays may be conducted
under well known experimental conditions which permit optimal protein activity. For
example, assays for E2F and DP-1 transcription factor activities are presented by Helin K.,
et al., 1993 Genes Dev. 7:1850-1861 and Ivey-Hoyle M., et al., 1993 Mol. Cell Biol.
13(12):7802-7812. Assays for DP-2 activity are presented by Zhang Y., et al., 1995
Oncogene 10(11):2085-2093. Assays for CDC-16 activity are presented by Lamb, J. R.,
et al., 1994 EMBO J. 13(18):4321-4328.

CLAIMS

What is claimed is: 

1. An isolated nucleic acid fragment encoding all or a substantial portion of a
CDC-16 protein comprising a member selected from the group consisting of:

(a) an isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence set forth in a member selected from the group
consisting of SEQ ID NO:2 and 4;

(b) an isolated nucleic acid fragment that is substantially similar to an
isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence set forth in a member selected from the group
consisting of SEQ ID NO:2 and 4; and

(c) an isolated nucleic acid fragment that is complementary to (a) or (b).

2. The isolated nucleic acid fragment of Claim 1 wherein the nucleotide sequence
of the fragment comprises all or a portion of the sequence set forth in a member selected
from the group consisting of SEQ ID NO:1 and 3.

3. A chimeric gene comprising the nucleic acid fragment of Claim 1 operably
linked to suitable regulatory sequences.

4. A transformed host cell comprising the chimeric gene of Claim 3.

5. A CDC-16 polypeptide comprising all or a substantial portion of the amino
acid sequence set forth in a member selected from the group consisting of SEQ ID NO:2 and
4.

6. An isolated nucleic acid fragment encoding all or a substantial portion of a
DP-1 protein comprising a member selected from the group consisting of:

(a) an isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence set forth in a member selected from the group
consisting of SEQ ID NO:6, 8 and 10;

(b) an isolated nucleic acid fragment that is substantially similar to an
isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence set forth in a member selected from the group
consisting of SEQ ID NO:6, 8 and 10; and

(c) an isolated nucleic acid fragment that is complementary to (a) or (b).

7. The isolated nucleic acid fragment of Claim 6 wherein the nucleotide sequence
of the fragment comprises all or a portion of the sequence set forth in a member selected
from the group consisting of SEQ ID NO:5, 7 and 9.

8. A chimeric gene comprising the nucleic acid fragment of Claim 6 operably
linked to suitable regulatory sequences.

9. A transformed host cell comprising the chimeric gene of Claim 8.

10. A DP-1 polypeptide comprising all or a substantial portion of the amino acid
sequence set forth in a member selected from the group consisting of SEQ ID NO:6, 8 and
10.

11. An isolated nucleic acid fragment encoding all or a substantial portion of a
DP-2 protein comprising a member selected from the group consisting of:

(a) an isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence set forth in a member selected from the group
consisting of SEQ ID NO:12 and 14;

(b) an isolated nucleic acid fragment that is substantially similar to an
isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence set forth in a member selected from the group
consisting of SEQ ID NO:12 and 14; and

(c) an isolated nucleic acid fragment that is complementary to (a) or (b).

12. The isolated nucleic acid fragment of Claim 11 wherein the nucleotide
sequence of the fragment comprises all or a portion of the sequence set forth in a member
selected from the group consisting of SEQ ID NO:11 and 13.

13. A chimeric gene comprising the nucleic acid fragment of Claim 11 operably
linked to suitable regulatory sequences.

14. A transformed host cell comprising the chimeric gene of Claim 13.

15. A DP-2 polypeptide comprising all or a substantial portion of the amino acid
sequence set forth in a member selected from the group consisting of SEQ ID NO:12 and 14.

16. An isolated nucleic acid fragment encoding all or a substantial portion of an E2F
protein comprising a member selected from the group consisting of:

(a) an isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence set forth in a member selected from the group
consisting of SEQ ID NO:16 and 18;

(b) an isolated nucleic acid fragment that is substantially similar to an
isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence set forth in a member selected from the group
consisting of SEQ ID NO:16 and 18; and

(c) an isolated nucleic acid fragment that is complementary to (a) or (b).

17. The isolated nucleic acid fragment of Claim 16 wherein the nucleotide
sequence of the fragment comprises all or a portion of the sequence set forth in a member
selected from the group consisting of SEQ ID NO:15 and 17.

18. A chimeric gene comprising the nucleic acid fragment of Claim 16 operably
linked to suitable regulatory sequences.

19. A transformed host cell comprising the chimeric gene of Claim 18.

20. An E2F polypeptide comprising all or a substantial portion of the amino acid
sequence set forth in a member selected from the group consisting of SEQ ID NO:16 and 18.

21. A method of altering the level of expression of a cell cycle regulatory protein in
a host cell comprising:

(a) transforming a host cell with the chimeric gene of any of Claims 3, 8,
13 and 18; and

(b) growing the transformed host cell produced in step (a) under conditions
that are suitable for expression of the chimeric gene

wherein expression of the chimeric gene results in production of altered levels of a cell cycle
regulatory protein in the transformed host cell.

22. A method of obtaining a nucleic acid fragment encoding all or a substantial
portion of the amino acid sequence encoding a cell cycle regulatory protein comprising:

(a) probing a cDNA or genomic library with the nucleic acid fragment of
any of Claims 1, 6, 11 and 16;

(b) identifying a DNA clone that hybridizes with the nucleic acid fragment
of any of Claims 1, 6, 11 and 16;

(c) isolating the DNA clone identified in step (b); and

(d) sequencing the cDNA or genomic fragment that comprises the clone
isolated in step (c)

wherein the sequenced nucleic acid fragment encodes all or a substantial portion of the
amino acid sequence encoding a cell cycle regulatory protein.

23. A method of obtaining a nucleic acid fragment encoding a substantial portion
of an amino acid sequence encoding a cell cycle regulatory protein comprising:

(a) synthesizing an oligonucleotide primer corresponding to a portion of the
sequence set forth in any of SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, and
17; and

(b) amplifying a cDNA insert present in a cloning vector using the
oligonucleotide primer of step (a) and a primer representing sequences
of the cloning vector

wherein the amplified nucleic acid fragment encodes a substantial portion of an amino acid
sequence encoding a cell cycle regulatory protein.

24. The product of the method of Claim 22. 

25. The product of the method of Claim 23. 

26. A method for evaluating at least one compound for its ability to inhibit the 
activity of a cell cycle regulatory protein, the method comprising the steps of: 



(a) transforming a host cell with a chimeric gene comprising a nucleic acid 
fragment encoding a cell cycle regulatory protein operably linked to 
suitable regulatory sequences; 

(b) growing the transformed host cell under conditions that are suitable for
expression of the chimeric gene wherein expression of the chimeric
gene results in production of the cell cycle regulatory protein encoded
by the operably linked nucleic acid fragment in the transformed host
cell;

(c) optionally purifying the cell cycle regulatory protein expressed by the 
transformed host cell; 

(d) treating the cell cycle regulatory protein with a compound to be tested; 
and 

(e) comparing the activity of the cell cycle regulatory protein that has been 
treated with a test compound to the activity of an untreated cell cycle 
regulatory protein, 

thereby selecting compounds with potential for inhibitory activity.

SEQUENCE LISTING



<110> E. I. DU PONT DE NEMOURS AND COMPANY 

<120> CELL CYCLE REGULATORY PROTEINS 

<130> BB-1161 

<140> 
<141> 

<150> 60/081,132 
<151> APRIL 9, 1998 

<160> 18 

<170> Microsoft Office 97 

<210> 1 

<211> 507 

<212> DNA 

<213> Zea mays 

<220> 

<221> unsure 
<222> (333) 

<220> 

<221> unsure 
<222> (364) 

<220> 

<221> unsure 
<222> (403) 

<220> 

<221> unsure 
<222> (426) 

<220> 

<221> unsure 
<222> (482) 

<400> 1 

gagaccaccg atccgccgtc ggagatgagg gaggaggcgc tggagcggct gcgcggggtg 60 

gtacgggaca gcgccgggaa gcacctotac acgtcggcca tcttcctcgc cgacaaggtg 120 

gcggcggcca cgggggaccc tggcgacgtc tacatgctcg cgcaggcact cttcctcggt 180 

cgtcaattcc gtcacgcgct ccacctcctc aacaattctc gcctgctccg cgacctccgg 240 

ttcagattcc tcgccgccaa gtgcctcgag gagttgaaag agtggcatca gtgtttgttg 300 

atgcttgggg atgcaaaagt ggatgagcat ggnaatgtcc tttgatcatg atgatgacag 360 

tganatttat tttgataagg acgcggaaag atcatgagat cantatcaaa tcagctctat 420 

gtttcntacg tggtaaaggc atatgaagca ctgggacaac cgtgatcttg ctcgccaatg 480 

gnacaaaggc agctgttaaa gccgatg 507 
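
The `unsure` feature positions declared for SEQ ID NO:1 can be cross-checked against the ambiguous bases (`n`) in the listed sequence. The Python sketch below is illustrative: it uses only the last four rows of the sequence (positions 301 to 507), and the variable names are hypothetical.

```python
# Rows of SEQ ID NO:1 covering positions 301-507, copied from the listing above.
ROWS_301_TO_507 = """
atgcttgggg atgcaaaagt ggatgagcat ggnaatgtcc tttgatcatg atgatgacag
tganatttat tttgataagg acgcggaaag atcatgagat cantatcaaa tcagctctat
gtttcntacg tggtaaaggc atatgaagca ctgggacaac cgtgatcttg ctcgccaatg
gnacaaaggc agctgttaaa gccgatg
"""
START = 301  # 1-based position of the first base in the excerpt

sequence = "".join(ROWS_301_TO_507.split())  # drop spaces and line breaks
unsure_positions = [START + offset
                    for offset, base in enumerate(sequence)
                    if base == "n"]

declared = [333, 364, 403, 426, 482]  # the <222> entries for SEQ ID NO:1
print("found:   ", unsure_positions)
print("declared:", declared)
print("consistent:", unsure_positions == declared)
```

The same check can be applied to the other nucleotide sequences in the listing by changing the excerpt, the start position and the declared positions.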

<210> 2 

<211> 89 

<212> PRT 

<213> Zea mays 

<400> 2 

Leu Glu Arg Leu Arg Gly Val Val Arg Asp Ser Ala Gly Lys His Leu
1 5 10 15

Tyr Thr Ser Ala Ile Phe Leu Ala Asp Lys Val Ala Ala Ala Thr Gly
20 25 30

Asp Pro Gly Asp Val Tyr Met Leu Ala Gln Ala Leu Phe Leu Gly Arg
35 40 45

Gln Phe Arg His Ala Leu His Leu Leu Asn Asn Ser Arg Leu Leu Arg
50 55 60

Asp Leu Arg Phe Arg Phe Leu Ala Ala Lys Cys Leu Glu Glu Leu Lys
65 70 75 80

Glu Trp His Gln Cys Leu Leu Met Leu
85


<210> 3 

<211> 1093 

<212> DNA 

<213> Glycine max 



<400> 3 

gcacgaggtt ttgagttaac aaatgatttg cttgagaagg atcttttcca tctaaagact 60

acgctagtac atctggcagc tgctgtggag cttgggcatt caaatgaact ctatctgatg 120

tcatgcaatt tggtgaagga ttatcctcag atggctttat catggtttgc tgttggttgc 180

tactactatt gtatcaagaa gtatgatcaa tcccgccgtt attttagcaa ggcaactagt 240

ctagatggaa catttccccc tgcatggata ggctatggga atgcttatgc tgctcaagaa 300

gagggagacc aggctatgtc tgcttaccgc actgcagcta gattgtttcc tgggtgccat 360

ttagcaactc tttacattgg aatggaatgc atgcggaccc acagttataa gcttgctgag 420

cagtttttta cgcaggccaa gtcaatttgc tcctcagatc cacttgtgta taatgaactt 480

ggagttgttg cctaccatat ggaagagtat aagaaagcag tgtggtggtt tgagaaaaca 540

ttagccttgg taccaaccac tttatctgaa atttgggaat cgactgtagt caatcttgca 600

catgcataca gaaagctgaa gatgtaccga gaagcaattt catattatga gaaagcactt 660

gcattatcaa caagaagtgt gagcacttat gccggtcttg catatactta ccacctacag 720

gatgatttca ccacagcaat tgcatattat cacaaagcct tgtggctgaa accagatgat 780

cagttctgca ctgaaatgtt aagttgggct ctaatagatg aaagtcgaag aggcgttgac 840

cccagtcttg aattccgttg aagtctgatt tgaatatgat ggtgcagcca tttgcacttg 900

taaatgtgta acaatgagtg ccgagttgat gtattgccaa tgtgcaattg ttaattatta 960

ttatcagttg tagataatct gtattttgtt tctgaataat gtttgatttt attatatctc 1020

tagcaatttt atactagtta aatattccgt gacttcacag gtattaatct taaaatttgt 1080

aatttttaat att 1093

<210> 4

<211> 264

<212> PRT

<213> Glycine max

<400> 4

Phe Glu Leu Thr Asn Asp Leu Leu Glu Lys Asp Leu Phe His Leu Lys
1 5 10 15

Thr Thr Leu Val His Leu Ala Ala Ala Val Glu Leu Gly His Ser Asn
20 25 30

Glu Leu Tyr Leu Met Ser Cys Asn Leu Val Lys Asp Tyr Pro Gln Met
35 40 45

Ala Leu Ser Trp Phe Ala Val Gly Cys Tyr Tyr Tyr Cys Ile Lys Lys
50 55 60

Tyr Asp Gln Ser Arg Arg Tyr Phe Ser Lys Ala Thr Ser Leu Asp Gly
65 70 75 80

Thr Phe Pro Pro Ala Trp Ile Gly Tyr Gly Asn Ala Tyr Ala Ala Gln
85 90 95


Glu Glu Gly Asp Gln Ala Met Ser Ala Tyr Arg Thr Ala Ala Arg Leu
100 105 110

Phe Pro Gly Cys His Leu Ala Thr Leu Tyr Ile Gly Met Glu Cys Met
115 120 125

Arg Thr His Ser Tyr Lys Leu Ala Glu Gln Phe Phe Thr Gln Ala Lys
130 135 140

Ser Ile Cys Ser Ser Asp Pro Leu Val Tyr Asn Glu Leu Gly Val Val
145 150 155 160

Ala Tyr His Met Glu Glu Tyr Lys Lys Ala Val Trp Trp Phe Glu Lys
165 170 175

Thr Leu Ala Leu Val Pro Thr Thr Leu Ser Glu Ile Trp Glu Ser Thr
180 185 190

Val Val Asn Leu Ala His Ala Tyr Arg Lys Leu Lys Met Tyr Arg Glu
195 200 205

Ala Ile Ser Tyr Tyr Glu Lys Ala Leu Ala Leu Ser Thr Arg Ser Val
210 215 220

Ser Thr Tyr Ala Gly Leu Ala Tyr Thr Tyr His Leu Gln Asp Asp Phe
225 230 235 240

Thr Thr Ala Ile Ala Tyr Tyr His Lys Ala Leu Trp Leu Lys Pro Asp
245 250 255

Asp Gln Phe Cys Thr Glu Met Leu
260

<210> 5 
<211> 665 
<212> DNA 

<213> Impatiens balsamia 
<220> 

<221> unsure 
<222> (199) 

<220> 

<221> unsure 
<222> (414) 

<220> 

<221> unsure 
<222> (416) 

<220> 

<221> unsure 
<222> (478) 

<220> 

<221> unsure 
<222> (519) 

<220> 

<221> unsure 
<222> (548) 



<220>



<221> unsure 

<222> (554) 

<220> 

<221> unsure 

<222> (569) 

<220> 

<221> unsure 

<222> (578) 

<220> 

<221> unsure 

<222> (598) 

<220> 

<221> unsure 

<222> (611) 

<220> 

<221> unsure 

<222> (617)..(618)

<220> 

<221> unsure 

<222> (632) 

<220> 

<221> unsure 

<222> (635) 

<220> 

<221> unsure 

<222> (637) 

<220> 

<221> unsure 

<222> (647) 

<220> 

<221> unsure 

<222> (657) 

<400> 5

agaagaaggg ... atccaaggac 60
aaaaaggaaa ... tgaagagata 120
aaggcagagc ... tttgcaagaa 180
cttgaggaac ... gttatatggc 240
tcagggaaca ... gacacgaccc 300
catgcaaccg ... cgacttcaac 360
agcactcctt ... attnancgga 420
ataaacgatg ... aagtattnca 480
agccaacgtc ... tagaatttgt 540
tcgtttgnca ... tccgttantt 600
gatatctggt ... accttancag 660
gattg 665

<210> 6

<211> 138

<212> PRT

<213> Impatiens balsamia

<220> 

<221> UNSURE 
<222> (67) 

<220> 

<221> UNSURE 
<222> (86)..(87)

<400> 6 

Arg Arg Arg Val Tyr Asp Ala Leu Asn Val Leu Met Ala Met Asp Ile
1 5 10 15

Ile Ser Lys Asp Lys Lys Glu Ile Gln Trp Lys Gly Leu Pro Arg Thr
20 25 30

Ser Leu Asn Asp Ile Glu Glu Ile Lys Ala Glu Arg Leu Gly Leu Arg
35 40 45

Ser Arg Ile Asp Lys Lys Thr Ala Tyr Leu Gln Glu Leu Glu Glu His
50 55 60

Tyr Val Xaa Leu Gln Asn Leu Val Gln Arg Asn Glu Arg Leu Tyr Gly
65 70 75 80

Ser Gly Asn Met Pro Xaa Xaa Pro Thr Gly Gly Val Ala Leu Pro Phe
85 90 95

Ile Leu Val Gln Thr Arg Pro His Ala Thr Val Glu Ile Glu Ile Ser
100 105 110

Glu Asp Met Gln Leu Val His Phe Asp Phe Asn Ser Thr Phe Glu Leu
115 120 125

His Asp Asp Asn Tyr Val Met Arg Ala Met
130 135

<210> 7 

<211> 296 

<212> DNA 

<213> Zea mays 

<400> 7 

gatattgaag aattgaagac tgagcttgtg ggactgaaag gtagaattga gaagaaaagt 60 

gtttacttac aggagctaca agatcaatat gtaggtttgc aaaacctgat tcaacgaaat 120 

gagcaattat atggttcagg aaacacacct tctggtggag tggctttgcc attcatccta 180 

gtccagaccc gacctcatgc aaccgtggaa gttgagatat cagaagatat gcagctggtt 240

cattttgact tcaatagcac tccatttgag ctgcatgatg actcatatgt cctaaa 296 
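
A nucleotide entry can be compared with its corresponding amino acid entry by translating it in the first reading frame. The following Python sketch is an illustration under stated assumptions: it uses the standard genetic code, reads only the first 120 nucleotides of SEQ ID NO:7, and reports residues in one-letter code; the function name `translate` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate frame 1 of a DNA string; unknown codons become 'X'."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, usable, 3))


# First two rows (positions 1-120) of SEQ ID NO:7, copied from the listing.
SEQ7_START = """
gatattgaag aattgaagac tgagcttgtg ggactgaaag gtagaattga gaagaaaagt
gtttacttac aggagctaca agatcaatat gtaggtttgc aaaacctgat tcaacgaaat
"""

protein = translate(SEQ7_START)
print(protein)
```

The translated residues begin Asp-Ile-Glu-Glu-Leu-Lys, which agrees with the opening residues listed for SEQ ID NO:8.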

<210> 8 

<211> 100 

<212> PRT 

<213> Zea mays 

<220> 

<221> UNSURE 
<222> (51).. (52) 

<400> 8 

Asp Ile Glu Glu Leu Lys Thr Glu Leu Val Gly Leu Lys Gly Arg Ile
1 5 10 15

Glu Lys Lys Ser Val Tyr Leu Gln Glu Leu Gln Asp Gln Tyr Val Gly
20 25 30


Leu 


Gin Asn 
35 


Leu 


Thr 


Pro Xaa 
50 


Xaa 


65 


Thr Arg 


Pro 


Gin 


Leu Val 


His 


Ser 


Tyr Val 


Leu 
100 




Glu Gln Leu Tyr Gly Ser Gly Asn
40 45

Ser Gly Gly Val Ala Leu Pro Phe Ile Leu Val
55 60

Ala Thr Val Glu Val Glu Ile Ser Glu Asp Met
80

90 95

Gln Leu Val His Phe Asp Phe Asn Ser Thr Phe Glu Leu His Asp Asp



<210> 9 

<211> 481 

<212> DNA 

<213> Glycine max 

<220> 

<221> unsure 

<222> (37) 



<220> 

<221> unsure 
<222> (82) 

<220> 

<221> unsure 

<222> (85) 

<220> 

<221> unsure 

<222> (110) 

<220> 

<221> unsure 

<222> (112) 

<220> 

<221> unsure

<222> (118)

<220>

<221> unsure 

<222> (153) 

<220> 

<221> unsure 

<222> (155) 

<220> 

<221> unsure 

<222> (169) 

<220> 

<221> unsure 

<222> (171) 

<220> 

<221> unsure 

<222> (183)

<220>

<221> unsure 

<222> (195) 

<220> 

<221> unsure 

<222> (214) 

<220> 

<221> unsure 

<222> (219) 

<220> 

<221> unsure 

<222> (224) 

<220> 

<221> unsure 

<222> (247) 

<220> 

<221> unsure 

<222> (265) 

<220> 

<221> unsure 

<222> (283) 

<220> 

<221> unsure 

<222> (289) 

<220> 

<221> unsure 

<222> (301) 

<220> 

<221> unsure 

<222> (315)..(316)

<220> 

<221> unsure 

<222> (323) 

<220> 

<221> unsure 

<222> (359) 

<220> 

<221> unsure 

<222> (361) 

<220> 

<221> unsure 

<222> (376) 

<220> 

<221> unsure 

<222> (407) 

<220> 

<221> unsure 

<222> (426)..(428)

<220>

<221> unsure 
<222> (445) 

<220> 

<221> unsure 
<222> (448) 

<220> 

<221> unsure 
<222> (457) 

<220> 

<221> unsure 
<222> (460) 

<400> 9 

agcaacaata tgatgagaaa aacatccgcc gaagggncga tgatgctctg aacgttctca 60 

tggcaatgga tattatttct antgncaaaa aaaaaattca atggaggggn cntcctcnca 120 

ctactgtgaa tgatattgaa gaactaaaga canancgtct tgggctcang natagaattg 180 

aanagaaaac aaccnatctg cacgagcttg aggngcaant cgtntgtctt cacaacctta 240

ttcaacnaaa tgagcaagtt atatngctca agaaatcctc ccncccggng ggggggcctt 300 

nccttttttt tggtnnagaa aantccccat gcaactgtgg ggggggggaa tatcaaaana 360 

natgcaacct ggtcantttg atttcaataa aaaacccttt ttttgcntgg cgaaaattaa 420 

tttttnnncg caagggaatt ttgtntgnga ccactgnccn gggtaatatg acacaaaaac 480

c 481

<210> 10

<211> 83

<212> PRT

<213> Glycine max

<220>

<221> UNSURE

<222> (10)

<220>

<221> UNSURE

<222> (25)..(26)

<220>

<221> UNSURE

<222> (35)

<220>

<221> UNSURE

<222> (37)

<220>

<221> UNSURE

<222> (49)

<220>

<221> UNSURE

<222> (54)..(55)

<220>

<221> UNSURE

<222> (59)

<220>

<221> UNSURE

<222> (63)

<220>

<221> UNSURE 
<222> (69) 

<220> 

<221> UNSURE 
<222> (71) 

<220> 

<221> UNSURE 
<222> (80) 

<400> 10 

Tyr Asp Glu Lys Asn Ile Arg Arg Arg Xaa Asp Asp Ala Leu Asn Val
1 5 10 15

Leu Met Ala Met Asp Ile Ile Ser Xaa Xaa Lys Lys Lys Ile Gln Trp
20 25 30

Arg Gly Xaa Pro Xaa Thr Thr Val Asn Asp Ile Glu Glu Leu Lys Thr
35 40 45

Xaa Arg Leu Gly Leu Xaa Xaa Arg Ile Glu Xaa Lys Thr Thr Xaa Leu
50 55 60

His Glu Leu Glu Xaa Gln Xaa Val Cys Leu His Asn Leu Ile Gln Xaa
65 70 75 80

Asn Glu Gln



<210> 11 
<211> 1193 
<212> DNA 
<213> Zea mays 

<220> 

<221> unsure 
<222> (238) 

<400> 11 

gcgacagcac gttcctccgc ttgaataatc tcgacatcaa cggcgacgac gcgccgtcgt 60 

cgcaggctcc tacgagcaag aagaaaagga gaggcacacg ggcagtgggt cctgataaag 120 

gtaaccgggg actgcgccag tttagtatga aagtttgtga gaaagttgaa agtaaaggga 180 

gaacaacata taatgaggtg gcagatgaac ttgttgctga gtttacagac cccaacanta 240 

atattgaggc accagatcct gataacccta acgcgcaaca atatgatgag aaaaatatac 300 

gagggcgagt ttatgatgct ttaaatgttc tgatggctat ggacattata tctaaagata 360 

aaaaggagat ccagtggaag ggcttgccgc ggactagtat aagtgatatt gaagaattga 420 

agactgagct tgtgggactg aaaggtagaa ttgagaagaa aagtgtttac ttacaggagc 480 

tacaagatca atatgtaggt ttgcaaaacc tgattcaacg aaatgagcaa ttatatggtt 540 

caggaaacac accttctggt ggagtggctt tgccattcat cctagtccag acccgacctc 600 

atgctaccgt ggaagttgag atatcagaag atatgcagct ggtgcatttt gacttcaata 660 

gcaccccatt cgagctgcac gacgactcat acgtcctaaa agaaatgcga ttctgtggaa 720 

gagaacaaca tgacagcact caagagtcga tatcaaatgg aggtgagagc tcaagcgtgt 780 

caaatattta ttggcaacac gtacagcatg tggaaaggcc aaacaatggc acaggtaggt 840 

taccgagctc accgcctatt ccagggatct tgaaagggcg tgcgaagcac gagcactaac 900 

gctagggtgt tggttcactt tccttttcgt ctgagcagtt ggttttattg catttctccg 960 

ttgtgtaaag ggccctgtaa attattaggc aagcgggagc gtagcttgat ctaatttagc 1020 

tctgcaccag attggtagaa cgacgggtgc tctaagtagt tgtgtaacta taactatcct 1080 

tgaatcggtt tctttagtat ggttgagaga agggttgaca tgtaattttg tagagccttg 1140 

taaaattaaa attgttgatt tctatcaggg tgaaatttct gggcacaaaa aaa 1193 

<210> 12 
<211> 194 
<212> PRT 
<213> Zea mays 

<220> 

<221> UNSURE 
<222> (42) 

<400> 12 


Asp Lys Gly Asn 
1 5 10 15 

Lys Val Glu Ser Lys Gly Arg Thr Thr Tyr Asn Glu Val Ala Asp Glu 
20 25 30 

Leu Val Ala Glu Phe Thr Asp Pro Asn Xaa Asn Ile Glu Ala Asn Ala 
35 40 45 

Gln Gln Tyr Asp Glu Lys Asn Ile Arg Gly Arg Val Tyr Asp Ala Leu 
50 55 60 

Asn Val Leu Met Ala Met Asp Ile Ile Ser Lys Asp Lys Lys Glu Ile 
65 70 75 80 

Gln Trp Lys Gly Leu Pro Arg Thr Ser Ile Ser Asp Ile Glu Glu Leu 
85 90 95 

Lys Thr Glu Leu Val Gly Leu Lys Gly Arg Ile Glu Lys Lys Ser Val 
100 105 110 

Tyr Leu Gln Glu Leu Gln Asp Gln Tyr Val Gly Leu Gln Asn Leu Ile 
115 120 125 

Gln Arg Asn Gln Arg Asn Glu Gln Leu Tyr Gly Ser Gly Asn Thr Pro 
130 135 140 

Ser Gly Gly Val Ala Leu Pro Phe Ile Leu Val Gln Thr Arg Pro His 
145 150 155 160 

Ala Thr Val Glu Val Glu Ile Ser Glu Asp Met Gln Leu Val His Phe 
165 170 175 

Asp Phe Asn Ser Thr Phe Glu Leu His Asp Asp Ser Tyr Val Leu Lys 
180 185 190 

Glu Met 



<210> 13 

<211> 698 

<212> DNA 

<213> Triticum aestivum 

<400> 13 



gcacgaggtt tgagtgatat tgataaattg aagactgagg tcattgggct gaaaggtaaa 60 

attgacaaga aaagtgcata tctgcaggaa ttacaagatc aatatjlggg S?c2laa!? 120 

ct^^cattca HlHT.r' gctatatggt tcgggagatg ctcca?c??? cggagtggic 180 

ctgccattca tattggttca gacacgtcct catgcaactg tcgaagtgga gatatcaaaa 240 

gatatgcagt tggtgcattt tgatttcaat agcactccgl ttgag?t??a ?gatgat?cc 300 

tttgtattga aagcaatggg gttctctggt aaagaagaaa ctgacggtac agtggcJcfq 360 

gttgcaaatg cggttgaatg ctcaagtgca tcaaatgttt atgggc|tcg a£cllcacaa 420 

oo^^f^^^"" caaatggaat taggctacga acctcacctc ctl??clagg gatactgaaa 480 

IValllllll lllMlt^"'' '^'^^gcacta aaatgggttg ttaatgtt?? gagagc?act 540 

tggatttata ctcccttggc ttctgtaact taacatgtaa aaaggcttgt LIttatgtc 600 

catggggaaa tgtgatgttc taatttagct ttgcagccga tggtaggctg atcggcatgt 660 
gaagctccag gcctgatttt aacttttagc tcaaaaaa 698 

<210> 14 

<211> 92 

<212> PRT 

<213> Triticum aestivum 



<400> 14 

Arg Ile Asp Lys Lys Ser Ala Tyr Leu Gln Glu Leu Gln Asp Gln Tyr 
1 5 10 15 

Ala Gly Leu Gln Asn Leu Val Glu Arg Asn Glu Arg Asn Glu Gln Leu 
20 25 30 

Tyr Gly Ser Gly Asp Ala Pro Ser Gly Gly Val Ala Leu Pro Phe Ile 
35 40 45 

Leu Val Gln Thr Arg Pro His Ala Thr Val Glu Val Glu Ile Ser Glu 
50 55 60 

Asp Met Gln Leu Val His Phe Asp Phe Asn Ser Thr Phe Glu Leu His 
65 70 75 80 

Asp Asp Ser Phe Val Leu Lys Ala Met Gly Phe Ser 
85 90 



<210> 15 

<211> 540 

<212> DNA 

<213> Oryza sativa 

<220> 

<221> unsure 

<222> (413) 

<220> 

<221> unsure 

<222> (461) 

<220> 

<221> unsure 

<222> (471) 

<220> 

<221> unsure 

<222> (498) 

<220> 

<221> unsure 

<222> (517) 



<400> 15 

ctgaagatga catcaagtct ttgccctgct tccaagaatc agacactgat cgcaatcaaa 60 

gcgcctcatg gtacaacttt ggaggtccca gatcctgatg aagtgaatga ttatccacag 120 

aggagatata ggattgttct aaggagtact atgggtccaa tagatgtcta cctagttagt 180 

caatttgagg agatgagtgg catggagact cctccaagga ctgtgcagcc agtaagcatg 240 

gattctctag agaatcccag gacgccattg gctgcagaac ccaacaaagc tgcagagtca 300 

caagccaaat attcaagatg ggctgctaat ggccttctga tgctccttct agttcacaag 360 

gatattggcg ggatgattga agaattgtcc cttcagaaac tttgataccg atngcagact 420 

actgggctcc ttatcaaaat tgccgggggg ttagcaatta nccgacatgt ngggaagaac 480 

aagcaacaaa aaggtggngg tggggaaagg gaatccnaaa aaattcaatt gcaagagggg 540 



<210> 16 
<211> 53 
<212> PRT 
<213> Oryza sativa 

<220> 
<221> UNSURE 
<222> (31) 



<400> 16 

Leu Cys Pro Ala Ser Lys Asn Gln Thr Leu Ile Ala Ile Lys Ala Pro 
1 5 10 15 

His Gly Thr Thr Leu Glu Val Pro Asp Pro Asp Glu Val Asn Xaa Gln 
20 25 30 

Arg Arg Tyr Arg Ile Val Leu Arg Ser Thr Met Gly Pro Ile Asp Val 
35 40 45 

Tyr Leu Val Ser Gln 
50 



<210> 17 

<211> 1600 

<212> DNA 

<213> Glycine max 



<400> 17 

gtacgagcga atcttcttca ttacatggct tcttcagatc ccatctcttc tcgccactac 60 

acttacaaca gaaaacaaaa atctctaggc cttctctgca ctaatttctt gagcttgtac 120 

aaccgagaca ccgttcactt gattggccta gacgatgctg ccacccgatt aggtgttgag 180 

agacggcgga tctatgacat tgtgaatgtc ctggagagta tcggggtact ctcaaggaaa 240 

gcaaaaaatc aatacacatg gagagggttt gccgcaatcc ctctcactct acaggatctc 300 

aaggaagagg ggttgaagga gaattctaat tccctacgtg ggcctggaaa ccatgataaa 360 

gtttccgacg acgaagatga cgaggaaaca cagtccaatc ccgccgctac tggaagtcaa 420 

agtgacaagt tgaaccctaa ttctactctt cctaaacctc tgaaaaatga aaatcgaaga 480 

gaaaaatctc tggctctgct tacccagaat tttgtcaagc tctttgtctg ctctaatgtg 540 

gaattgatct cccttgacga ggctgcaaaa ttgctacttg gggatgccca taatacgtct 600 

gtaatgagaa ctaaagttag acgcctatat gacatcgcaa atgtgctatc ctccatgaac 660 

cttattgaga agacccatac aatggataca cgaaaaccag cattcaggtg gctcggttct 720 

gaagggaaga catgggatga gacacrtcac aaatcaaacc taaacgactc taggaaaagg 780 

gcgtttggaa gtgatatcac taacatcagt tttgagagga ataaagtgga attgttcacg 840 

^9tggggact tgaatccgaa tcctaagaag ccaagaatgg aaaatggtag tgggctgggt 900 

gaggcagacg aaaacaattt aaaacaaggc ataaaacaag cttcaaagag ctatgaattt 960 

ggtcctttcg ctccagcttg cgtgcccaaa gttggagcct ctcagaataa taatatgaag 1020 

caggtgcatg actggggcag tcttgcaact gcacatagcc ctcagtatca aaatgaagcc 1080 

ttgagagaac tcttctccca ttacatggaa gcatggaaat tgtggtactc agaaattgct 1140 

aggaagaggc cattacaaat tttgtaaacg tatgccaaag tcaaactata ggtatgtttg 1200 

gatttagtaa aatttagtct ggaaacttgg tttcagagaa gcaagacttg ggatactagt 1260 

gtaaacttaa gtttgcaaat tttaatccaa acatggggtg taactttcct cactagagaa 1320 

tatctgtagt atgcattctc ttcatgcgtt tgtagagctg agcttccatt tactgccttt 1380 

tcaaccgtga tgttgaccga ctgaatcagc ttaggcagtt cttgcaaatt atttatcaca 1440 

tttgtttgtt gtatagtttc taccctttat ttaatgtagc aggaaatctt gtgcactact 1500 

cctgtactgc agtctggagc tgctaattta aggaaaataa aattcgagtt gttttaaaaa 1560 

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1600 

<210> 18 

<211> 80 

<212> PRT 

<213> Glycine max 

<400> 18 

Met Ala Ser Ser Asp Pro Ile Ser Ser Arg His Tyr Thr Tyr Asn Arg 
1 5 10 15 

Lys Gln Lys Ser Leu Gly Leu Leu Cys Thr Asn Phe Leu Ser Leu Tyr 
20 25 30 

Asn Arg Asp Thr Val His Leu Ile Gly Leu Asp Asp Ala Ala Thr Arg 
35 40 45 

Leu Gly Val Glu Arg Arg Arg Ile Tyr Asp Ile Val Asn Val Leu Glu 
50 55 60 

Ser Ile Gly Val Leu Ser Arg Lys Ala Lys Asn Gln Tyr Thr Trp Arg 
65 70 75 80 